Remove artifacts from CI manually
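I have a private repository on gitlab.com that uses the CI feature. Some of the CI jobs create artifact files that are stored. I just implemented automatic deletion of the artifacts after one day by adding this to the CI configuration: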
expire_in: 1 day
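That works great. However, old artifacts are not deleted (as expected). So my question is:
How can I delete old artifacts, or artifacts that do not expire? (On gitlab.com, with no direct access to the server.)
I'm on GitLab 8.17, and I was able to remove the artifacts of a particular job by navigating to the storage directory on the server itself. The default path is: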
/var/opt/gitlab/gitlab-rails/shared/artifacts/<year_month>/<project_id?>/<jobid>
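Removing the whole folder for the job, or just its contents, makes the artifact view disappear from the GitLab pipeline page.
The storage path can be changed as described in the docs: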
https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/administration/job_artifacts.md#storing-job-artifacts
You can use the GitLab REST API to delete the artifacts from the jobs if you don't have direct access to the server. Here's a sample curl script:
#!/bin/bash
# project_id, find it here: https://gitlab.com/[organization name]/[repository name]/edit inside the "General project settings" tab
project_id="3034900"
# token, find it here: https://gitlab.com/profile/personal_access_tokens
token="Lifg_azxDyRp8eyNFRfg"
server="gitlab.com"
# go to https://gitlab.com/[organization name]/[repository name]/-/jobs
# then open JavaScript console
# copy/paste => copy(_.uniq($('.ci-status').map((x, e) => /([0-9]+)/.exec(e.href)).toArray()).join(' '))
# press enter, and then copy the result here :
# repeat for every page you want
job_ids=(48875658 48874137 48873496 48872419)
for job_id in "${job_ids[@]}";
do
URL="https://$server/api/v4/projects/$project_id/jobs/$job_id/erase"
echo "$URL"
curl --request POST --header "PRIVATE-TOKEN:${token}" "$URL"
echo
done
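For readers who prefer Python to curl, the same erase call could be sketched roughly as below. This is only an illustration using the standard library; the function names `build_erase_request` and `erase_jobs` are hypothetical, and the endpoint is the same `/jobs/:id/erase` URL used in the script above.

```python
import urllib.error
import urllib.request


def build_erase_request(server, project_id, job_id, token):
    """Build (but do not send) the POST request that erases one job."""
    url = f"https://{server}/api/v4/projects/{project_id}/jobs/{job_id}/erase"
    return urllib.request.Request(
        url, method="POST", headers={"PRIVATE-TOKEN": token}
    )


def erase_jobs(server, project_id, job_ids, token):
    """Send one erase request per job ID and report the HTTP status."""
    for job_id in job_ids:
        request = build_erase_request(server, project_id, job_id, token)
        try:
            with urllib.request.urlopen(request) as response:
                print(job_id, response.status)
        except urllib.error.HTTPError as err:
            # For example a job that does not exist or was already erased.
            print(job_id, "failed with HTTP", err.code)


# Example call (placeholders, not real values):
# erase_jobs("gitlab.com", "3034900", [48875658, 48874137], "<your_access_token>")
```

Keeping the request construction separate from sending it makes it easy to print or inspect the URLs before anything is actually erased.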
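According to the documentation, erasing the entire job log (clicking the trash can) also deletes the artifacts.
Building on @David's answer:
@Philipp pointed out that there is now an API endpoint that deletes only the job artifacts instead of the entire job.
You can run this script directly in the browser console, or run it in Node.js using node-fetch.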
//Go to: https://gitlab.com/profile/personal_access_tokens
const API_KEY = "API_KEY";
//You can find project id inside the "General project settings" tab
const PROJECT_ID = 12345678;
const PROJECT_URL = "https://gitlab.com/api/v4/projects/" + PROJECT_ID + "/"
let jobs = [];
for(let i = 0, currentJobs = []; i == 0 || currentJobs.length > 0; i++){
currentJobs = await sendApiRequest(
PROJECT_URL + "jobs/?per_page=100&page=" + (i + 1)
).then(e => e.json());
jobs = jobs.concat(currentJobs);
}
//skip jobs without artifacts (an empty array is truthy in JavaScript, so check its length)
jobs = jobs.filter(e => e.artifacts && e.artifacts.length > 0);
//keep the latest build.
jobs.shift();
for(let job of jobs)
await sendApiRequest(
PROJECT_URL + "jobs/" + job.id + "/artifacts",
{method: "DELETE"}
);
async function sendApiRequest(url, options = {}){
if(!options.headers)
options.headers = {};
options.headers["PRIVATE-TOKEN"] = API_KEY;
return fetch(url, options);
}
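The selection step in the script above (skip jobs without artifacts, keep the latest build) can be isolated as a small pure function, which makes it easy to check before deleting anything. The sketch below is in Python and assumes each job is a dict with an `id` and an `artifacts` list, as in the script above, and that a higher `id` means a newer job. The name `jobs_to_clean` is hypothetical.

```python
def jobs_to_clean(jobs, keep_latest=1):
    """Return the jobs whose artifacts should be deleted.

    Jobs with an empty or missing "artifacts" list are skipped, and the
    `keep_latest` newest remaining jobs (by id, an assumption) are kept.
    """
    with_artifacts = [job for job in jobs if job.get("artifacts")]
    # Newest first, assuming job IDs grow over time.
    with_artifacts.sort(key=lambda job: job["id"], reverse=True)
    return with_artifacts[keep_latest:]


example_jobs = [
    {"id": 3, "artifacts": []},
    {"id": 2, "artifacts": [{"file_type": "archive"}]},
    {"id": 1, "artifacts": [{"file_type": "archive"}]},
]
print([job["id"] for job in jobs_to_clean(example_jobs)])
```

In this example job 3 has no artifacts and job 2 is the newest job that does, so only job 1 is selected for cleanup.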
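If you accidentally deleted all the jobs (thinking the artifacts would go away, but they didn't), what alternative is there to brute-forcing a range in a loop?
I have this code, which brute-forces over a range of job IDs. But because I use the public runners on gitlab.com, the range is very long.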
# project_id, find it here: https://gitlab.com/[organization name]/[repository name]/edit inside the "General project settings" tab
project_id="xxxxxx" #
# token, find it here: https://gitlab.com/profile/personal_access_tokens
token="yyyyy"
server="gitlab.com"
# Get the range between the oldest known job and the latest known one, then brute-force it. Used when you have deleted the pipelines and can't retrieve the job IDs.
#
for (( job_id = 59216999; job_id <= 190239535; job_id++ )); do
echo "Job ID being deleted is $job_id"
curl --request POST --header "PRIVATE-TOKEN:${token}" "https://${server}/api/v4/projects/${project_id}/jobs/${job_id}/erase"
echo -en '\n'
echo -en '\n'
done
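To get a feel for why this range is impractical, a quick back-of-the-envelope calculation helps. The sketch below uses the two job IDs from the loop above. The request rate is purely an assumption for illustration, not a measured or documented figure.

```python
FIRST_JOB_ID = 59216999
LAST_JOB_ID = 190239535
ASSUMED_REQUESTS_PER_SECOND = 10  # hypothetical rate, adjust to your situation


def brute_force_estimate(first_id, last_id, requests_per_second):
    """Return (number of requests, duration in days) for an inclusive ID range."""
    count = last_id - first_id + 1
    seconds = count / requests_per_second
    return count, seconds / 86400


count, days = brute_force_estimate(
    FIRST_JOB_ID, LAST_JOB_ID, ASSUMED_REQUESTS_PER_SECOND
)
print(f"{count:,} requests, roughly {days:.0f} days at "
      f"{ASSUMED_REQUESTS_PER_SECOND} requests per second")
```

With more than 130 million candidate IDs, almost all of which belong to other projects on gitlab.com, the loop spends nearly all of its time on requests that cannot succeed.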
This Python solution works with GitLab 13.11.3.
#!/usr/bin/env python3
# delete_artifacts.py
import json
import requests
# adapt accordingly
base_url='https://gitlab.example.com'
project_id='1234'
access_token='123412341234'
#
# Get the GitLab version. Tested with version 13.11.3
# cf. https://docs.gitlab.com/ee/api/version.html#version-api
#
print(f'GET /version')
x= (requests.get(f"{base_url}/api/v4/version", headers = {"PRIVATE-TOKEN": access_token }))
print(x)
data=json.loads(x.text)
print(f'Using GitLab version {data["version"]}. Tested with 13.11.3')
#
# List project jobs
# cf. https://docs.gitlab.com/ee/api/jobs.html#list-project-jobs
#
request_str=f'projects/{project_id}/jobs'
url=f'{base_url}/api/v4/{request_str}'
print(f'GET /{request_str}')
x= (requests.get(url, headers = {"PRIVATE-TOKEN": access_token }))
print(x)
data=json.loads(x.text)
input('WARNING: This will delete all artifacts. Job logs will remain available. Press Enter to continue...' )
#
# Delete job artifacts
# cf. https://docs.gitlab.com/ee/api/job_artifacts.html#delete-artifacts
#
for entry in data:
    request_str=f'projects/{project_id}/jobs/{entry["id"]}/artifacts'
    url=f'{base_url}/api/v4/{request_str}'
    print(f'DELETE /{request_str}')
    x = requests.delete(url, headers = {"PRIVATE-TOKEN": access_token })
    print(x)
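Note that the script above requests the job list only once, so it sees just the first page of results. A possible way to walk all pages, using the same `page` and `per_page` query parameters as the JavaScript answer, is sketched below. The helper names `list_all_jobs` and `gitlab_fetch_page` are hypothetical, and the sketch uses only the standard library instead of `requests`.

```python
import json
import urllib.request


def list_all_jobs(fetch_page, per_page=100):
    """Collect jobs from every page until an empty page is returned."""
    jobs = []
    page = 1
    while True:
        batch = fetch_page(page, per_page)
        if not batch:
            break
        jobs.extend(batch)
        page += 1
    return jobs


def gitlab_fetch_page(base_url, project_id, access_token):
    """Return a function that fetches one page of project jobs."""
    def fetch(page, per_page):
        url = (f"{base_url}/api/v4/projects/{project_id}/jobs"
               f"?per_page={per_page}&page={page}")
        request = urllib.request.Request(
            url, headers={"PRIVATE-TOKEN": access_token}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch


# Example call (placeholders, not real values):
# jobs = list_all_jobs(gitlab_fetch_page("https://gitlab.example.com", "1234", "<token>"))
```

Passing the page fetcher in as a function keeps the loop independent of the HTTP library, so it can be tried against fake data first.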
I will keep an updated version here. Feel free to get in touch and improve the code.
This should be easier to script with API calls. GitLab 14.7 (January 2022) now offers:
Bulk delete artifacts with the API
While a good strategy for managing storage consumption is to set regular expiration policies for artifacts, sometimes you need to reduce items in storage right away.
Previously, you might have used a script to automate the tedious task of deleting artifacts one by one with API calls, but now you can use a new API endpoint to bulk delete job artifacts quickly and easily.
See Documentation, Issue 223793 and Merge Request 75488.
curl --request DELETE --header "PRIVATE-TOKEN: <your_access_token>" \
"https://gitlab.example.com/api/v4/projects/1/artifacts"
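For consistency with the Python answer above, the same bulk-delete call could be written roughly as follows. This is a sketch using only the standard library. The function names are hypothetical, and the URL is the one from the curl example.

```python
import urllib.request


def build_bulk_delete_request(base_url, project_id, access_token):
    """Build (but do not send) the DELETE request for a project's artifacts."""
    url = f"{base_url}/api/v4/projects/{project_id}/artifacts"
    return urllib.request.Request(
        url, method="DELETE", headers={"PRIVATE-TOKEN": access_token}
    )


def bulk_delete_artifacts(base_url, project_id, access_token):
    """Send the bulk delete request and return the HTTP status code."""
    request = build_bulk_delete_request(base_url, project_id, access_token)
    with urllib.request.urlopen(request) as response:
        return response.status


# Example call (placeholders, not real values):
# print(bulk_delete_artifacts("https://gitlab.example.com", 1, "<your_access_token>"))
```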