Rework repository archive (#14723)

* Use storage to store archive files

* Fix backend lint

* Add archiver table on database

* Finish archive download

* Fix test

* Add database migrations

* Add status for archiver

* Fix lint

* Add queue

* Add doctor to check and delete old archives

* Improve archive queue

* Fix tests

* Improve archive storage

* Delete repo archives

* Add missing fixture

* Fix fixture

* Fix fixture

* Fix test

* Fix archiver cleaning

* Fix bug

* Add docs for repository archive storage

* Remove repo-archive configuration

* Fix test

* Fix test

* Fix lint

Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
Branch: tokarchuk/v1.17
Lunny Xiao committed b223d36195 (parent c9c7afda1a)
25 changed files (lines changed in parentheses):

1. custom/conf/app.example.ini (10)
2. docs/content/doc/advanced/config-cheat-sheet.en-us.md (17)
3. docs/content/doc/advanced/config-cheat-sheet.zh-cn.md (15)
4. integrations/gitea-repositories-meta/user27/repo49.git/refs/heads/test/archive (1)
5. models/fixtures/repo_archiver.yml (1)
6. models/migrations/migrations.go (2)
7. models/migrations/v181.go (1)
8. models/migrations/v185.go (22)
9. models/models.go (1)
10. models/repo.go (97)
11. models/repo_archiver.go (86)
12. models/unit_tests.go (2)
13. modules/context/context.go (15)
14. modules/doctor/checkOldArchives.go (59)
15. modules/git/repo_archive.go (31)
16. modules/setting/repository.go (6)
17. modules/setting/storage.go (4)
18. modules/storage/storage.go (15)
19. routers/api/v1/repo/file.go (3)
20. routers/common/repo.go (26)
21. routers/init.go (4)
22. routers/web/repo/repo.go (114)
23. routers/web/web.go (3)
24. services/archiver/archiver.go (394)
25. services/archiver/archiver_test.go (159)

custom/conf/app.example.ini

@@ -2048,6 +2048,16 @@ PATH =
 ;; storage type
 ;STORAGE_TYPE = local
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; settings for repository archives, will override storage setting
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;[storage.repo-archive]
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; storage type
+;STORAGE_TYPE = local
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;; lfs storage will override storage

docs/content/doc/advanced/config-cheat-sheet.en-us.md

@@ -995,6 +995,23 @@ MINIO_USE_SSL = false
 
 And used by `[attachment]`, `[lfs]` and etc. as `STORAGE_TYPE`.
 
+## Repository Archive Storage (`storage.repo-archive`)
+
+Configuration for repository archive storage. It inherits from the default `[storage]` section, or from `[storage.xxx]` when `STORAGE_TYPE` is set to `xxx`. The default `PATH` is `data/repo-archive`, and the default `MINIO_BASE_PATH` is `repo-archive/`.
+
+- `STORAGE_TYPE`: **local**: Storage type for repository archives: `local` for local disk, `minio` for an S3-compatible object storage service, or any other name defined with `[storage.xxx]`.
+- `SERVE_DIRECT`: **false**: Allows the storage driver to redirect to authenticated URLs to serve files directly. Currently only Minio/S3 is supported via signed URLs; `local` does nothing.
+- `PATH`: **./data/repo-archive**: Where to store archive files; only available when `STORAGE_TYPE` is `local`.
+- `MINIO_ENDPOINT`: **localhost:9000**: Minio endpoint to connect to; only available when `STORAGE_TYPE` is `minio`.
+- `MINIO_ACCESS_KEY_ID`: Minio accessKeyID to connect with; only available when `STORAGE_TYPE` is `minio`.
+- `MINIO_SECRET_ACCESS_KEY`: Minio secretAccessKey to connect with; only available when `STORAGE_TYPE` is `minio`.
+- `MINIO_BUCKET`: **gitea**: Minio bucket to store the archives; only available when `STORAGE_TYPE` is `minio`.
+- `MINIO_LOCATION`: **us-east-1**: Minio location to create the bucket; only available when `STORAGE_TYPE` is `minio`.
+- `MINIO_BASE_PATH`: **repo-archive/**: Minio base path on the bucket; only available when `STORAGE_TYPE` is `minio`.
+- `MINIO_USE_SSL`: **false**: Enable SSL for Minio; only available when `STORAGE_TYPE` is `minio`.
+
 ## Other (`other`)
 
 - `SHOW_FOOTER_BRANDING`: **false**: Show Gitea branding in the footer.
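Putting the documented options together, a hypothetical `app.ini` section that keeps archives on an S3-compatible backend could look like the following (bucket, endpoint, and credentials are placeholders):

	[storage.repo-archive]
	STORAGE_TYPE = minio
	SERVE_DIRECT = true
	MINIO_ENDPOINT = localhost:9000
	MINIO_ACCESS_KEY_ID = placeholder-key-id
	MINIO_SECRET_ACCESS_KEY = placeholder-secret
	MINIO_BUCKET = gitea
	MINIO_BASE_PATH = repo-archive/
	MINIO_USE_SSL = false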

docs/content/doc/advanced/config-cheat-sheet.zh-cn.md (translated)

@@ -382,6 +382,21 @@ MINIO_USE_SSL = false
 
 You can then use this name as the value of `STORAGE_TYPE` in `[attachment]`, `[lfs]`, etc.
 
+## Repository Archive Storage (`storage.repo-archive`)
+
+Storage configuration for repository archives. If `STORAGE_TYPE` is empty, this configuration inherits from `[storage]`. If it is set to some `xxx` other than `local` or `minio`, it inherits from `[storage.xxx]`. When inheriting, `PATH` defaults to `data/repo-archive` and `MINIO_BASE_PATH` defaults to `repo-archive/`.
+
+- `STORAGE_TYPE`: **local**: Storage type for repository archives; `local` stores to disk, `minio` stores to an S3-compatible object service.
+- `SERVE_DIRECT`: **false**: Allows redirecting directly to the storage system. Currently only Minio/S3 is supported.
+- `PATH`: Where repository archive files are stored; defaults to `data/repo-archive`.
+- `MINIO_ENDPOINT`: **localhost:9000**: Minio endpoint; only effective when `STORAGE_TYPE` is `minio`.
+- `MINIO_ACCESS_KEY_ID`: Minio accessKeyID; only effective when `STORAGE_TYPE` is `minio`.
+- `MINIO_SECRET_ACCESS_KEY`: Minio secretAccessKey; only effective when `STORAGE_TYPE` is `minio`.
+- `MINIO_BUCKET`: **gitea**: Minio bucket; only effective when `STORAGE_TYPE` is `minio`.
+- `MINIO_LOCATION`: **us-east-1**: Minio location; only effective when `STORAGE_TYPE` is `minio`.
+- `MINIO_BASE_PATH`: **repo-archive/**: Minio base path; only effective when `STORAGE_TYPE` is `minio`.
+- `MINIO_USE_SSL`: **false**: Whether Minio uses SSL; only effective when `STORAGE_TYPE` is `minio`.
+
 ## Other (`other`)
 
 - `SHOW_FOOTER_BRANDING`: When true, show Gitea branding in the page footer.

models/migrations/migrations.go

@@ -319,6 +319,8 @@ var migrations = []Migration{
 	NewMigration("Create PushMirror table", createPushMirrorTable),
 	// v184 -> v185
 	NewMigration("Rename Task errors to message", renameTaskErrorsToMessage),
+	// v185 -> v186
+	NewMigration("Add new table repo_archiver", addRepoArchiver),
 }
 
 // GetCurrentDBVersion returns the current db version

models/migrations/v181.go

@@ -1,3 +1,4 @@
+// Copyright 2021 The Gitea Authors. All rights reserved.
 // Use of this source code is governed by a MIT-style
 // license that can be found in the LICENSE file.

models/migrations/v185.go

@@ -0,0 +1,22 @@
+// Copyright 2021 The Gitea Authors. All rights reserved.
+// Use of this source code is governed by a MIT-style
+// license that can be found in the LICENSE file.
+
+package migrations
+
+import (
+	"xorm.io/xorm"
+)
+
+func addRepoArchiver(x *xorm.Engine) error {
+	// RepoArchiver represents all archivers
+	type RepoArchiver struct {
+		ID          int64  `xorm:"pk autoincr"`
+		RepoID      int64  `xorm:"index unique(s)"`
+		Type        int    `xorm:"unique(s)"`
+		Status      int
+		CommitID    string `xorm:"VARCHAR(40) unique(s)"`
+		CreatedUnix int64  `xorm:"INDEX NOT NULL created"`
+	}
+	return x.Sync2(new(RepoArchiver))
+}

models/models.go

@@ -136,6 +136,7 @@ func init() {
 		new(RepoTransfer),
 		new(IssueIndex),
 		new(PushMirror),
+		new(RepoArchiver),
 	)
 
 	gonicNames := []string{"SSL", "UID"}

models/repo.go

@@ -1587,6 +1587,22 @@ func DeleteRepository(doer *User, uid, repoID int64) error {
 		return err
 	}
 
+	// Remove archives
+	var archives []*RepoArchiver
+	if err = sess.Where("repo_id=?", repoID).Find(&archives); err != nil {
+		return err
+	}
+
+	for _, v := range archives {
+		v.Repo = repo
+		p, _ := v.RelativePath()
+		removeStorageWithNotice(sess, storage.RepoArchives, "Delete repo archive file", p)
+	}
+
+	if _, err := sess.Delete(&RepoArchiver{RepoID: repoID}); err != nil {
+		return err
+	}
+
 	if repo.NumForks > 0 {
 		if _, err = sess.Exec("UPDATE `repository` SET fork_id=0,is_fork=? WHERE fork_id=?", false, repo.ID); err != nil {
 			log.Error("reset 'fork_id' and 'is_fork': %v", err)
@@ -1768,64 +1784,45 @@ func DeleteRepositoryArchives(ctx context.Context) error {
 func DeleteOldRepositoryArchives(ctx context.Context, olderThan time.Duration) error {
 	log.Trace("Doing: ArchiveCleanup")
 
-	if err := x.Where("id > 0").Iterate(new(Repository), func(idx int, bean interface{}) error {
-		return deleteOldRepositoryArchives(ctx, olderThan, idx, bean)
-	}); err != nil {
-		log.Trace("Error: ArchiveClean: %v", err)
-		return err
-	}
-
-	log.Trace("Finished: ArchiveCleanup")
-	return nil
-}
-
-func deleteOldRepositoryArchives(ctx context.Context, olderThan time.Duration, idx int, bean interface{}) error {
-	repo := bean.(*Repository)
-	basePath := filepath.Join(repo.RepoPath(), "archives")
-
-	for _, ty := range []string{"zip", "targz"} {
-		select {
-		case <-ctx.Done():
-			return ErrCancelledf("before deleting old repository archives with filetype %s for %s", ty, repo.FullName())
-		default:
-		}
-
-		path := filepath.Join(basePath, ty)
-		file, err := os.Open(path)
-		if err != nil {
-			if !os.IsNotExist(err) {
-				log.Warn("Unable to open directory %s: %v", path, err)
-				return err
-			}
-
-			// If the directory doesn't exist, that's okay.
-			continue
-		}
-
-		files, err := file.Readdir(0)
-		file.Close()
-		if err != nil {
-			log.Warn("Unable to read directory %s: %v", path, err)
-			return err
-		}
-
-		minimumOldestTime := time.Now().Add(-olderThan)
-		for _, info := range files {
-			if info.ModTime().Before(minimumOldestTime) && !info.IsDir() {
-				select {
-				case <-ctx.Done():
-					return ErrCancelledf("before deleting old repository archive file %s with filetype %s for %s", info.Name(), ty, repo.FullName())
-				default:
-				}
-
-				toDelete := filepath.Join(path, info.Name())
-				// This is a best-effort purge, so we do not check error codes to confirm removal.
-				if err = util.Remove(toDelete); err != nil {
-					log.Trace("Unable to delete %s, but proceeding: %v", toDelete, err)
-				}
-			}
-		}
-	}
-
-	return nil
-}
+	for {
+		var archivers []RepoArchiver
+		err := x.Where("created_unix < ?", time.Now().Add(-olderThan).Unix()).
+			Asc("created_unix").
+			Limit(100).
+			Find(&archivers)
+		if err != nil {
+			log.Trace("Error: ArchiveClean: %v", err)
+			return err
+		}
+
+		for _, archiver := range archivers {
+			if err := deleteOldRepoArchiver(ctx, &archiver); err != nil {
+				return err
+			}
+		}
+		if len(archivers) < 100 {
+			break
+		}
+	}
+
+	log.Trace("Finished: ArchiveCleanup")
+	return nil
+}
+
+var delRepoArchiver = new(RepoArchiver)
+
+func deleteOldRepoArchiver(ctx context.Context, archiver *RepoArchiver) error {
+	p, err := archiver.RelativePath()
+	if err != nil {
+		return err
+	}
+	_, err = x.ID(archiver.ID).Delete(delRepoArchiver)
+	if err != nil {
+		return err
+	}
+	if err := storage.RepoArchives.Delete(p); err != nil {
+		log.Error("delete repo archive file failed: %v", err)
+	}
+	return nil
+}

models/repo_archiver.go

@@ -0,0 +1,86 @@
+// Copyright 2021 The Gitea Authors. All rights reserved.
+// Use of this source code is governed by a MIT-style
+// license that can be found in the LICENSE file.
+
+package models
+
+import (
+	"fmt"
+
+	"code.gitea.io/gitea/modules/git"
+	"code.gitea.io/gitea/modules/timeutil"
+)
+
+// RepoArchiverStatus represents repo archive status
+type RepoArchiverStatus int
+
+// enumerate all repo archive statuses
+const (
+	RepoArchiverGenerating = iota // the archiver is generating
+	RepoArchiverReady             // it's ready
+)
+
+// RepoArchiver represents all archivers
+type RepoArchiver struct {
+	ID          int64              `xorm:"pk autoincr"`
+	RepoID      int64              `xorm:"index unique(s)"`
+	Repo        *Repository        `xorm:"-"`
+	Type        git.ArchiveType    `xorm:"unique(s)"`
+	Status      RepoArchiverStatus
+	CommitID    string             `xorm:"VARCHAR(40) unique(s)"`
+	CreatedUnix timeutil.TimeStamp `xorm:"INDEX NOT NULL created"`
+}
+
+// LoadRepo loads repository
+func (archiver *RepoArchiver) LoadRepo() (*Repository, error) {
+	if archiver.Repo != nil {
+		return archiver.Repo, nil
+	}
+
+	var repo Repository
+	has, err := x.ID(archiver.RepoID).Get(&repo)
+	if err != nil {
+		return nil, err
+	}
+	if !has {
+		return nil, ErrRepoNotExist{
+			ID: archiver.RepoID,
+		}
+	}
+	return &repo, nil
+}
+
+// RelativePath returns relative path
+func (archiver *RepoArchiver) RelativePath() (string, error) {
+	repo, err := archiver.LoadRepo()
+	if err != nil {
+		return "", err
+	}
+
+	return fmt.Sprintf("%s/%s/%s.%s", repo.FullName(), archiver.CommitID[:2], archiver.CommitID, archiver.Type.String()), nil
+}
+
+// GetRepoArchiver get an archiver
+func GetRepoArchiver(ctx DBContext, repoID int64, tp git.ArchiveType, commitID string) (*RepoArchiver, error) {
+	var archiver RepoArchiver
+	has, err := ctx.e.Where("repo_id=?", repoID).And("`type`=?", tp).And("commit_id=?", commitID).Get(&archiver)
+	if err != nil {
+		return nil, err
+	}
+	if has {
+		return &archiver, nil
+	}
+	return nil, nil
+}
+
+// AddRepoArchiver adds an archiver
+func AddRepoArchiver(ctx DBContext, archiver *RepoArchiver) error {
+	_, err := ctx.e.Insert(archiver)
+	return err
+}
+
+// UpdateRepoArchiverStatus updates archiver's status
+func UpdateRepoArchiverStatus(ctx DBContext, archiver *RepoArchiver) error {
+	_, err := ctx.e.ID(archiver.ID).Cols("status").Update(archiver)
+	return err
+}
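To make the storage layout concrete: with `RelativePath` as defined above, a tar.gz archive of repository `user27/repo49` at a commit starting `aacbdfe9e1c4` (a short SHA from the test fixtures; real paths use the full 40-character ID) would live at a path of the shape

	user27/repo49/aa/aacbdfe9e1c4….tar.gz

inside the archive storage. Archives are sharded by the first two hex characters of the commit ID and no longer live under the repository's own directory on disk.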

models/unit_tests.go

@@ -74,6 +74,8 @@ func MainTest(m *testing.M, pathToGiteaRoot string) {
 	setting.RepoAvatar.Storage.Path = filepath.Join(setting.AppDataPath, "repo-avatars")
 
+	setting.RepoArchive.Storage.Path = filepath.Join(setting.AppDataPath, "repo-archive")
+
 	if err = storage.Init(); err != nil {
 		fatalTestError("storage.Init: %v\n", err)
 	}

modules/context/context.go

@@ -380,6 +380,21 @@ func (ctx *Context) ServeFile(file string, names ...string) {
 	http.ServeFile(ctx.Resp, ctx.Req, file)
 }
 
+// ServeStream serves file via io stream
+func (ctx *Context) ServeStream(rd io.Reader, name string) {
+	ctx.Resp.Header().Set("Content-Description", "File Transfer")
+	ctx.Resp.Header().Set("Content-Type", "application/octet-stream")
+	ctx.Resp.Header().Set("Content-Disposition", "attachment; filename="+name)
+	ctx.Resp.Header().Set("Content-Transfer-Encoding", "binary")
+	ctx.Resp.Header().Set("Expires", "0")
+	ctx.Resp.Header().Set("Cache-Control", "must-revalidate")
+	ctx.Resp.Header().Set("Pragma", "public")
+	_, err := io.Copy(ctx.Resp, rd)
+	if err != nil {
+		ctx.ServerError("Download file failed", err)
+	}
+}
+
 // Error returned an error to web browser
 func (ctx *Context) Error(status int, contents ...string) {
 	var v = http.StatusText(status)

modules/doctor/checkOldArchives.go

@@ -0,0 +1,59 @@
+// Copyright 2021 The Gitea Authors. All rights reserved.
+// Use of this source code is governed by a MIT-style
+// license that can be found in the LICENSE file.
+
+package doctor
+
+import (
+	"os"
+	"path/filepath"
+
+	"code.gitea.io/gitea/models"
+	"code.gitea.io/gitea/modules/log"
+	"code.gitea.io/gitea/modules/util"
+)
+
+func checkOldArchives(logger log.Logger, autofix bool) error {
+	numRepos := 0
+	numReposUpdated := 0
+	err := iterateRepositories(func(repo *models.Repository) error {
+		if repo.IsEmpty {
+			return nil
+		}
+
+		p := filepath.Join(repo.RepoPath(), "archives")
+		isDir, err := util.IsDir(p)
+		if err != nil {
+			log.Warn("check if %s is directory failed: %v", p, err)
+		}
+		if isDir {
+			numRepos++
+			if autofix {
+				if err := os.RemoveAll(p); err == nil {
+					numReposUpdated++
+				} else {
+					log.Warn("remove %s failed: %v", p, err)
+				}
+			}
+		}
+		return nil
+	})
+
+	if autofix {
+		logger.Info("%d / %d old archives in repository deleted", numReposUpdated, numRepos)
+	} else {
+		logger.Info("%d old archives in repository need to be deleted", numRepos)
+	}
+
+	return err
+}
+
+func init() {
+	Register(&Check{
+		Title:     "Check old archives",
+		Name:      "check-old-archives",
+		IsDefault: false,
+		Run:       checkOldArchives,
+		Priority:  7,
+	})
+}
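Assuming the flags of Gitea's `doctor` subcommand behave here as for other registered checks (`--run` selects a check by its registered name, `--fix` enables autofix), the new check would be invoked roughly like:

	gitea doctor --run check-old-archives        # report leftover pre-rework archive directories
	gitea doctor --run check-old-archives --fix  # also remove them

Since `IsDefault` is false, the check only runs when requested by name.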

modules/git/repo_archive.go

@@ -8,6 +8,7 @@ package git
 
 import (
 	"context"
 	"fmt"
+	"io"
 	"path/filepath"
 	"strings"
 )
@@ -33,32 +34,28 @@ func (a ArchiveType) String() string {
 	return "unknown"
 }
 
-// CreateArchiveOpts represents options for creating an archive
-type CreateArchiveOpts struct {
-	Format ArchiveType
-	Prefix bool
-}
-
 // CreateArchive create archive content to the target path
-func (c *Commit) CreateArchive(ctx context.Context, target string, opts CreateArchiveOpts) error {
-	if opts.Format.String() == "unknown" {
-		return fmt.Errorf("unknown format: %v", opts.Format)
+func (repo *Repository) CreateArchive(ctx context.Context, format ArchiveType, target io.Writer, usePrefix bool, commitID string) error {
+	if format.String() == "unknown" {
+		return fmt.Errorf("unknown format: %v", format)
 	}
 
 	args := []string{
 		"archive",
 	}
-	if opts.Prefix {
-		args = append(args, "--prefix="+filepath.Base(strings.TrimSuffix(c.repo.Path, ".git"))+"/")
+	if usePrefix {
+		args = append(args, "--prefix="+filepath.Base(strings.TrimSuffix(repo.Path, ".git"))+"/")
 	}
 
 	args = append(args,
-		"--format="+opts.Format.String(),
-		"-o",
-		target,
-		c.ID.String(),
+		"--format="+format.String(),
+		commitID,
 	)
 
-	_, err := NewCommandContext(ctx, args...).RunInDir(c.repo.Path)
-	return err
+	var stderr strings.Builder
+	err := NewCommandContext(ctx, args...).RunInDirPipeline(repo.Path, target, &stderr)
+	if err != nil {
+		return ConcatenateError(err, stderr.String())
+	}
+	return nil
 }
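The key change is that `git archive` now writes to any `io.Writer` via `RunInDirPipeline` instead of to an on-disk target with `-o`. A minimal sketch of a caller, not part of the commit (the repository path and commit ID come from the caller; imports assumed: `context`, `io`, and `code.gitea.io/gitea/modules/git`):

	// Stream a zip archive of one commit into out.
	func writeZip(ctx context.Context, repoPath, commitID string, out io.Writer) error {
		gitRepo, err := git.OpenRepository(repoPath)
		if err != nil {
			return err
		}
		defer gitRepo.Close()
		// The archive is piped straight into out; no temp file and no
		// second copy, which is what makes storage-backed archives possible.
		return gitRepo.CreateArchive(ctx, git.ZIP, out, true /* usePrefix */, commitID)
	}

Because `out` can just as well be an `io.Pipe` writer or an HTTP response, the same call serves both the background archiver and direct downloads.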

modules/setting/repository.go

@@ -251,6 +251,10 @@ var (
 	}
 	RepoRootPath string
 	ScriptType   = "bash"
+
+	RepoArchive = struct {
+		Storage
+	}{}
 )
 
 func newRepository() {
@@ -328,4 +332,6 @@ func newRepository() {
 	if !filepath.IsAbs(Repository.Upload.TempPath) {
 		Repository.Upload.TempPath = path.Join(AppWorkPath, Repository.Upload.TempPath)
 	}
+
+	RepoArchive.Storage = getStorage("repo-archive", "", nil)
 }

modules/setting/storage.go

@@ -43,6 +43,10 @@ func getStorage(name, typ string, targetSec *ini.Section) Storage {
 	sec.Key("MINIO_LOCATION").MustString("us-east-1")
 	sec.Key("MINIO_USE_SSL").MustBool(false)
 
+	if targetSec == nil {
+		targetSec, _ = Cfg.NewSection(name)
+	}
+
 	var storage Storage
 	storage.Section = targetSec
 	storage.Type = typ

modules/storage/storage.go

@@ -114,6 +114,9 @@ var (
 	Avatars ObjectStorage
 	// RepoAvatars represents repository avatars storage
 	RepoAvatars ObjectStorage
+	// RepoArchives represents repository archives storage
+	RepoArchives ObjectStorage
 )
 
 // Init init the stoarge
@@ -130,7 +133,11 @@ func Init() error {
 		return err
 	}
 
-	return initLFS()
+	if err := initLFS(); err != nil {
+		return err
+	}
+
+	return initRepoArchives()
 }
 
 // NewStorage takes a storage type and some config and returns an ObjectStorage or an error
@@ -169,3 +176,9 @@ func initRepoAvatars() (err error) {
 	RepoAvatars, err = NewStorage(setting.RepoAvatar.Storage.Type, &setting.RepoAvatar.Storage)
 	return
 }
+
+func initRepoArchives() (err error) {
+	log.Info("Initialising Repository Archive storage with type: %s", setting.RepoArchive.Storage.Type)
+	RepoArchives, err = NewStorage(setting.RepoArchive.Storage.Type, &setting.RepoArchive.Storage)
+	return
+}

routers/api/v1/repo/file.go

@@ -18,6 +18,7 @@ import (
 	api "code.gitea.io/gitea/modules/structs"
 	"code.gitea.io/gitea/modules/web"
 	"code.gitea.io/gitea/routers/common"
+	"code.gitea.io/gitea/routers/web/repo"
 )
 
 // GetRawFile get a file by path on a repository
@@ -126,7 +127,7 @@ func GetArchive(ctx *context.APIContext) {
 	ctx.Repo.GitRepo = gitRepo
 	defer gitRepo.Close()
 
-	common.Download(ctx.Context)
+	repo.Download(ctx.Context)
 }
 
 // GetEditorconfig get editor config of a repository

routers/common/repo.go

@@ -7,7 +7,6 @@ package common
 
 import (
 	"fmt"
 	"io"
-	"net/http"
 	"path"
 	"path/filepath"
 	"strings"
@@ -19,7 +18,6 @@ import (
 	"code.gitea.io/gitea/modules/log"
 	"code.gitea.io/gitea/modules/setting"
 	"code.gitea.io/gitea/modules/typesniffer"
-	"code.gitea.io/gitea/services/archiver"
 )
 
 // ServeBlob download a git.Blob
@@ -41,30 +39,6 @@ func ServeBlob(ctx *context.Context, blob *git.Blob) error {
 	return ServeData(ctx, ctx.Repo.TreePath, blob.Size(), dataRc)
 }
 
-// Download an archive of a repository
-func Download(ctx *context.Context) {
-	uri := ctx.Params("*")
-	aReq := archiver.DeriveRequestFrom(ctx, uri)
-
-	if aReq == nil {
-		ctx.Error(http.StatusNotFound)
-		return
-	}
-
-	downloadName := ctx.Repo.Repository.Name + "-" + aReq.GetArchiveName()
-	complete := aReq.IsComplete()
-	if !complete {
-		aReq = archiver.ArchiveRepository(aReq)
-		complete = aReq.WaitForCompletion(ctx)
-	}
-
-	if complete {
-		ctx.ServeFile(aReq.GetArchivePath(), downloadName)
-	} else {
-		ctx.Error(http.StatusNotFound)
-	}
-}
-
 // ServeData download file from io.Reader
 func ServeData(ctx *context.Context, name string, size int64, reader io.Reader) error {
 	buf := make([]byte, 1024)

routers/init.go

@@ -33,6 +33,7 @@ import (
 	"code.gitea.io/gitea/routers/common"
 	"code.gitea.io/gitea/routers/private"
 	web_routers "code.gitea.io/gitea/routers/web"
+	"code.gitea.io/gitea/services/archiver"
 	"code.gitea.io/gitea/services/auth"
 	"code.gitea.io/gitea/services/mailer"
 	mirror_service "code.gitea.io/gitea/services/mirror"
@@ -63,6 +64,9 @@ func NewServices() {
 	mailer.NewContext()
 	_ = cache.NewContext()
 	notification.NewContext()
+	if err := archiver.Init(); err != nil {
+		log.Fatal("archiver init failed: %v", err)
+	}
 }
 
 // GlobalInit is for global configuration reload-able.
// GlobalInit is for global configuration reload-able. // GlobalInit is for global configuration reload-able.

routers/web/repo/repo.go

@@ -15,8 +15,10 @@ import (
 	"code.gitea.io/gitea/models"
 	"code.gitea.io/gitea/modules/base"
 	"code.gitea.io/gitea/modules/context"
+	"code.gitea.io/gitea/modules/graceful"
 	"code.gitea.io/gitea/modules/log"
 	"code.gitea.io/gitea/modules/setting"
+	"code.gitea.io/gitea/modules/storage"
 	"code.gitea.io/gitea/modules/web"
 	archiver_service "code.gitea.io/gitea/services/archiver"
 	"code.gitea.io/gitea/services/forms"
@@ -364,6 +366,123 @@ func RedirectDownload(ctx *context.Context) {
 	ctx.Error(http.StatusNotFound)
 }
 
+// Download an archive of a repository
+func Download(ctx *context.Context) {
+	uri := ctx.Params("*")
+	aReq, err := archiver_service.NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, uri)
+	if err != nil {
+		ctx.ServerError("archiver_service.NewRequest", err)
+		return
+	}
+	if aReq == nil {
+		ctx.Error(http.StatusNotFound)
+		return
+	}
+
+	archiver, err := models.GetRepoArchiver(models.DefaultDBContext(), aReq.RepoID, aReq.Type, aReq.CommitID)
+	if err != nil {
+		ctx.ServerError("models.GetRepoArchiver", err)
+		return
+	}
+	if archiver != nil && archiver.Status == models.RepoArchiverReady {
+		download(ctx, aReq.GetArchiveName(), archiver)
+		return
+	}
+
+	if err := archiver_service.StartArchive(aReq); err != nil {
+		ctx.ServerError("archiver_service.StartArchive", err)
+		return
+	}
+
+	var times int
+	var t = time.NewTicker(time.Second * 1)
+	defer t.Stop()
+
+	for {
+		select {
+		case <-graceful.GetManager().HammerContext().Done():
+			log.Warn("exit archive download because system stop")
+			return
+		case <-t.C:
+			if times > 20 {
+				ctx.ServerError("wait download timeout", nil)
+				return
+			}
+			times++
+			archiver, err = models.GetRepoArchiver(models.DefaultDBContext(), aReq.RepoID, aReq.Type, aReq.CommitID)
+			if err != nil {
+				ctx.ServerError("archiver_service.StartArchive", err)
+				return
+			}
+			if archiver != nil && archiver.Status == models.RepoArchiverReady {
+				download(ctx, aReq.GetArchiveName(), archiver)
+				return
+			}
+		}
+	}
+}
+
+func download(ctx *context.Context, archiveName string, archiver *models.RepoArchiver) {
+	downloadName := ctx.Repo.Repository.Name + "-" + archiveName
+	rPath, err := archiver.RelativePath()
+	if err != nil {
+		ctx.ServerError("archiver.RelativePath", err)
+		return
+	}
+
+	if setting.RepoArchive.ServeDirect {
+		// If we have a signed url (S3, object storage), redirect to this directly.
+		u, err := storage.RepoArchives.URL(rPath, downloadName)
+		if u != nil && err == nil {
+			ctx.Redirect(u.String())
+			return
+		}
+	}
+
+	// If we have matched and access to release or issue
+	fr, err := storage.RepoArchives.Open(rPath)
+	if err != nil {
+		ctx.ServerError("Open", err)
+		return
+	}
+	defer fr.Close()
+	ctx.ServeStream(fr, downloadName)
+}
+
 // InitiateDownload will enqueue an archival request, as needed.  It may submit
 // a request that's already in-progress, but the archiver service will just
 // kind of drop it on the floor if this is the case.
 func InitiateDownload(ctx *context.Context) {
 	uri := ctx.Params("*")
-	aReq := archiver_service.DeriveRequestFrom(ctx, uri)
+	aReq, err := archiver_service.NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, uri)
+	if err != nil {
+		ctx.ServerError("archiver_service.NewRequest", err)
+		return
+	}
 	if aReq == nil {
 		ctx.Error(http.StatusNotFound)
 		return
 	}
 
-	complete := aReq.IsComplete()
-	if !complete {
-		aReq = archiver_service.ArchiveRepository(aReq)
-		complete, _ = aReq.TimedWaitForCompletion(ctx, 2*time.Second)
+	archiver, err := models.GetRepoArchiver(models.DefaultDBContext(), aReq.RepoID, aReq.Type, aReq.CommitID)
+	if err != nil {
+		ctx.ServerError("archiver_service.StartArchive", err)
+		return
+	}
+	if archiver == nil || archiver.Status != models.RepoArchiverReady {
+		if err := archiver_service.StartArchive(aReq); err != nil {
+			ctx.ServerError("archiver_service.StartArchive", err)
+			return
+		}
+	}
+
+	var completed bool
+	if archiver != nil && archiver.Status == models.RepoArchiverReady {
+		completed = true
 	}
 
 	ctx.JSON(http.StatusOK, map[string]interface{}{
-		"complete": complete,
+		"complete": completed,
 	})
 }

routers/web/web.go

@@ -22,7 +22,6 @@ import (
 	"code.gitea.io/gitea/modules/validation"
 	"code.gitea.io/gitea/modules/web"
 	"code.gitea.io/gitea/routers/api/v1/misc"
-	"code.gitea.io/gitea/routers/common"
 	"code.gitea.io/gitea/routers/web/admin"
 	"code.gitea.io/gitea/routers/web/dev"
 	"code.gitea.io/gitea/routers/web/events"
@@ -888,7 +887,7 @@ func RegisterRoutes(m *web.Route) {
 		}, context.RepoRef(), repo.MustBeNotEmpty, context.RequireRepoReaderOr(models.UnitTypeCode))
 
 		m.Group("/archive", func() {
-			m.Get("/*", common.Download)
+			m.Get("/*", repo.Download)
 			m.Post("/*", repo.InitiateDownload)
 		}, repo.MustBeNotEmpty, reqRepoCodeReader)

services/archiver/archiver.go

@@ -6,22 +6,20 @@
 package archiver
 
 import (
+	"errors"
+	"fmt"
 	"io"
-	"io/ioutil"
 	"os"
-	"path"
 	"regexp"
 	"strings"
-	"sync"
-	"time"
 
-	"code.gitea.io/gitea/modules/base"
-	"code.gitea.io/gitea/modules/context"
+	"code.gitea.io/gitea/models"
 	"code.gitea.io/gitea/modules/git"
 	"code.gitea.io/gitea/modules/graceful"
 	"code.gitea.io/gitea/modules/log"
+	"code.gitea.io/gitea/modules/queue"
 	"code.gitea.io/gitea/modules/setting"
-	"code.gitea.io/gitea/modules/util"
+	"code.gitea.io/gitea/modules/storage"
 )
 
 // ArchiveRequest defines the parameters of an archive request, which notably
@@ -30,223 +28,174 @@ import (
 // This is entirely opaque to external entities, though, and mostly used as a
 // handle elsewhere.
 type ArchiveRequest struct {
-	uri             string
-	repo            *git.Repository
-	refName         string
-	ext             string
-	archivePath     string
-	archiveType     git.ArchiveType
-	archiveComplete bool
-	commit          *git.Commit
-	cchan           chan struct{}
+	RepoID   int64
+	refName  string
+	Type     git.ArchiveType
+	CommitID string
 }
 
-var archiveInProgress []*ArchiveRequest
-var archiveMutex sync.Mutex
-
 // SHA1 hashes will only go up to 40 characters, but SHA256 hashes will go all
 // the way to 64.
 var shaRegex = regexp.MustCompile(`^[0-9a-f]{4,64}$`)
 
-// These facilitate testing, by allowing the unit tests to control (to some extent)
-// the goroutine used for processing the queue.
-var archiveQueueMutex *sync.Mutex
-var archiveQueueStartCond *sync.Cond
-var archiveQueueReleaseCond *sync.Cond
-
-// GetArchivePath returns the path from which we can serve this archive.
-func (aReq *ArchiveRequest) GetArchivePath() string {
-	return aReq.archivePath
-}
-
-// GetArchiveName returns the name of the caller, based on the ref used by the
-// caller to create this request.
-func (aReq *ArchiveRequest) GetArchiveName() string {
-	return aReq.refName + aReq.ext
-}
-
-// IsComplete returns the completion status of this request.
-func (aReq *ArchiveRequest) IsComplete() bool {
-	return aReq.archiveComplete
-}
-
-// WaitForCompletion will wait for this request to complete, with no timeout.
-// It returns whether the archive was actually completed, as the channel could
-// have also been closed due to an error.
-func (aReq *ArchiveRequest) WaitForCompletion(ctx *context.Context) bool {
-	select {
-	case <-aReq.cchan:
-	case <-ctx.Done():
-	}
-
-	return aReq.IsComplete()
-}
-
-// TimedWaitForCompletion will wait for this request to complete, with timeout
-// happening after the specified Duration.  It returns whether the archive is
-// now complete and whether we hit the timeout or not.  The latter may not be
-// useful if the request is complete or we started to shutdown.
-func (aReq *ArchiveRequest) TimedWaitForCompletion(ctx *context.Context, dur time.Duration) (bool, bool) {
-	timeout := false
-	select {
-	case <-time.After(dur):
-		timeout = true
-	case <-aReq.cchan:
-	case <-ctx.Done():
-	}
-
-	return aReq.IsComplete(), timeout
-}
-
-// The caller must hold the archiveMutex across calls to getArchiveRequest.
-func getArchiveRequest(repo *git.Repository, commit *git.Commit, archiveType git.ArchiveType) *ArchiveRequest {
-	for _, r := range archiveInProgress {
-		// Need to be referring to the same repository.
-		if r.repo.Path == repo.Path && r.commit.ID == commit.ID && r.archiveType == archiveType {
-			return r
-		}
-	}
-	return nil
-}
-
-// DeriveRequestFrom creates an archival request, based on the URI.  The
+// NewRequest creates an archival request, based on the URI.  The
 // resulting ArchiveRequest is suitable for being passed to ArchiveRepository()
 // if it's determined that the request still needs to be satisfied.
-func DeriveRequestFrom(ctx *context.Context, uri string) *ArchiveRequest {
-	if ctx.Repo == nil || ctx.Repo.GitRepo == nil {
-		log.Trace("Repo not initialized")
-		return nil
-	}
-	r := &ArchiveRequest{
-		uri:  uri,
-		repo: ctx.Repo.GitRepo,
-	}
-
-	switch {
-	case strings.HasSuffix(uri, ".zip"):
-		r.ext = ".zip"
-		r.archivePath = path.Join(r.repo.Path, "archives/zip")
-		r.archiveType = git.ZIP
-	case strings.HasSuffix(uri, ".tar.gz"):
-		r.ext = ".tar.gz"
-		r.archivePath = path.Join(r.repo.Path, "archives/targz")
-		r.archiveType = git.TARGZ
-	default:
-		log.Trace("Unknown format: %s", uri)
-		return nil
-	}
-
-	r.refName = strings.TrimSuffix(r.uri, r.ext)
-	isDir, err := util.IsDir(r.archivePath)
-	if err != nil {
-		ctx.ServerError("Download -> util.IsDir(archivePath)", err)
-		return nil
-	}
-	if !isDir {
-		if err := os.MkdirAll(r.archivePath, os.ModePerm); err != nil {
-			ctx.ServerError("Download -> os.MkdirAll(archivePath)", err)
-			return nil
-		}
-	}
-
-	// Get corresponding commit.
-	if r.repo.IsBranchExist(r.refName) {
-		r.commit, err = r.repo.GetBranchCommit(r.refName)
-		if err != nil {
-			ctx.ServerError("GetBranchCommit", err)
-			return nil
-		}
-	} else if r.repo.IsTagExist(r.refName) {
-		r.commit, err = r.repo.GetTagCommit(r.refName)
-		if err != nil {
-			ctx.ServerError("GetTagCommit", err)
-			return nil
-		}
-	} else if shaRegex.MatchString(r.refName) {
-		r.commit, err = r.repo.GetCommit(r.refName)
-		if err != nil {
-			ctx.NotFound("GetCommit", nil)
-			return nil
-		}
-	} else {
-		ctx.NotFound("DeriveRequestFrom", nil)
-		return nil
-	}
-
-	archiveMutex.Lock()
-	defer archiveMutex.Unlock()
-	if rExisting := getArchiveRequest(r.repo, r.commit, r.archiveType); rExisting != nil {
-		return rExisting
-	}
-
-	r.archivePath = path.Join(r.archivePath, base.ShortSha(r.commit.ID.String())+r.ext)
-	r.archiveComplete, err = util.IsFile(r.archivePath)
-	if err != nil {
-		ctx.ServerError("util.IsFile", err)
-		return nil
-	}
-	return r
-}
-
-func doArchive(r *ArchiveRequest) {
-	var (
-		err         error
-		tmpArchive  *os.File
-		destArchive *os.File
-	)
-
-	// Close the channel to indicate to potential waiters that this request
-	// has finished.
-	defer close(r.cchan)
-
-	// It could have happened that we enqueued two archival requests, due to
-	// race conditions and difficulties in locking.  Do one last check that
-	// the archive we're referring to doesn't already exist.  If it does exist,
-	// then just mark the request as complete and move on.
-	isFile, err := util.IsFile(r.archivePath)
-	if err != nil {
-		log.Error("Unable to check if %s util.IsFile: %v.  Will ignore and recreate.", r.archivePath, err)
-	}
-	if isFile {
-		r.archiveComplete = true
-		return
-	}
-
-	// Create a temporary file to use while the archive is being built.  We
-	// will then copy it into place (r.archivePath) once it's fully
-	// constructed.
-	tmpArchive, err = ioutil.TempFile("", "archive")
-	if err != nil {
-		log.Error("Unable to create a temporary archive file! Error: %v", err)
-		return
-	}
-	defer func() {
-		tmpArchive.Close()
-		os.Remove(tmpArchive.Name())
-	}()
-
-	if err = r.commit.CreateArchive(graceful.GetManager().ShutdownContext(), tmpArchive.Name(), git.CreateArchiveOpts{
-		Format: r.archiveType,
-		Prefix: setting.Repository.PrefixArchiveFiles,
-	}); err != nil {
-		log.Error("Download -> CreateArchive "+tmpArchive.Name(), err)
-		return
-	}
-
-	// Now we copy it into place
-	if destArchive, err = os.Create(r.archivePath); err != nil {
-		log.Error("Unable to open archive " + r.archivePath)
-		return
-	}
-	_, err = io.Copy(destArchive, tmpArchive)
-	destArchive.Close()
-	if err != nil {
-		log.Error("Unable to write archive " + r.archivePath)
-		return
-	}
-
-	// Block any attempt to finalize creating a new request if we're marking
-	r.archiveComplete = true
-}
+func NewRequest(repoID int64, repo *git.Repository, uri string) (*ArchiveRequest, error) {
+	r := &ArchiveRequest{
+		RepoID: repoID,
+	}
+
+	var ext string
+	switch {
+	case strings.HasSuffix(uri, ".zip"):
+		ext = ".zip"
+		r.Type = git.ZIP
+	case strings.HasSuffix(uri, ".tar.gz"):
+		ext = ".tar.gz"
+		r.Type = git.TARGZ
+	default:
+		return nil, fmt.Errorf("Unknown format: %s", uri)
+	}
+
+	r.refName = strings.TrimSuffix(uri, ext)
+
+	var err error
+	// Get corresponding commit.
+	if repo.IsBranchExist(r.refName) {
+		r.CommitID, err = repo.GetBranchCommitID(r.refName)
+		if err != nil {
+			return nil, err
+		}
+	} else if repo.IsTagExist(r.refName) {
+		r.CommitID, err = repo.GetTagCommitID(r.refName)
+		if err != nil {
+			return nil, err
+		}
+	} else if shaRegex.MatchString(r.refName) {
+		if repo.IsCommitExist(r.refName) {
+			r.CommitID = r.refName
+		} else {
+			return nil, git.ErrNotExist{
+				ID: r.refName,
+			}
+		}
+	} else {
+		return nil, fmt.Errorf("Unknow ref %s type", r.refName)
+	}
+
+	return r, nil
+}
+
+// GetArchiveName returns the name of the caller, based on the ref used by the
+// caller to create this request.
+func (aReq *ArchiveRequest) GetArchiveName() string {
+	return strings.ReplaceAll(aReq.refName, "/", "-") + "." + aReq.Type.String()
+}
+
+func doArchive(r *ArchiveRequest) (*models.RepoArchiver, error) {
+	ctx, commiter, err := models.TxDBContext()
+	if err != nil {
+		return nil, err
+	}
+	defer commiter.Close()
+
+	archiver, err := models.GetRepoArchiver(ctx, r.RepoID, r.Type, r.CommitID)
+	if err != nil {
+		return nil, err
+	}
+
+	if archiver != nil {
+		// FIXME: If another process are generating it, we think it's not ready and just return
+		// Or we should wait until the archive generated.
+		if archiver.Status == models.RepoArchiverGenerating {
+			return nil, nil
+		}
+	} else {
+		archiver = &models.RepoArchiver{
+			RepoID:   r.RepoID,
+			Type:     r.Type,
+			CommitID: r.CommitID,
+			Status:   models.RepoArchiverGenerating,
+		}
+		if err := models.AddRepoArchiver(ctx, archiver); err != nil {
+			return nil, err
+		}
+	}
+
+	rPath, err := archiver.RelativePath()
+	if err != nil {
+		return nil, err
+	}
+
+	_, err = storage.RepoArchives.Stat(rPath)
+	if err == nil {
+		if archiver.Status == models.RepoArchiverGenerating {
+			archiver.Status = models.RepoArchiverReady
+			return archiver, models.UpdateRepoArchiverStatus(ctx, archiver)
+		}
+		return archiver, nil
+	}
+
+	if !errors.Is(err, os.ErrNotExist) {
+		return nil, fmt.Errorf("unable to stat archive: %v", err)
+	}
+
+	rd, w := io.Pipe()
+	defer func() {
+		w.Close()
+		rd.Close()
+	}()
+	var done = make(chan error)
+	repo, err := archiver.LoadRepo()
+	if err != nil {
+		return nil, fmt.Errorf("archiver.LoadRepo failed: %v", err)
+	}
+
+	gitRepo, err := git.OpenRepository(repo.RepoPath())
+	if err != nil {
+		return nil, err
+	}
+	defer gitRepo.Close()
+
+	go func(done chan error, w *io.PipeWriter, archiver *models.RepoArchiver, gitRepo *git.Repository) {
+		defer func() {
+			if r := recover(); r != nil {
+				done <- fmt.Errorf("%v", r)
+			}
+		}()
+
+		err = gitRepo.CreateArchive(
+			graceful.GetManager().ShutdownContext(),
+			archiver.Type,
+			w,
+			setting.Repository.PrefixArchiveFiles,
+			archiver.CommitID,
+		)
+		_ = w.CloseWithError(err)
+		done <- err
+	}(done, w, archiver, gitRepo)
+
+	// TODO: add lfs data to zip
+	// TODO: add submodule data to zip
+
+	if _, err := storage.RepoArchives.Save(rPath, rd, -1); err != nil {
+		return nil, fmt.Errorf("unable to write archive: %v", err)
+	}
+
+	err = <-done
+	if err != nil {
+		return nil, err
+	}
+
+	if archiver.Status == models.RepoArchiverGenerating {
+		archiver.Status = models.RepoArchiverReady
+		if err = models.UpdateRepoArchiverStatus(ctx, archiver); err != nil {
+			return nil, err
+		}
+	}
+
+	return archiver, commiter.Commit()
+}
 
 // ArchiveRepository satisfies the ArchiveRequest being passed in.  Processing
@@ -255,65 +204,46 @@ func doArchive(r *ArchiveRequest) {
 // anything.  In all cases, the caller should be examining the *ArchiveRequest
 // being returned for completion, as it may be different than the one they passed
 // in.
-func ArchiveRepository(request *ArchiveRequest) *ArchiveRequest {
-	// We'll return the request that's already been enqueued if it has been
-	// enqueued, or we'll immediately enqueue it if it has not been enqueued
-	// and it is not marked complete.
-	archiveMutex.Lock()
-	defer archiveMutex.Unlock()
-	if rExisting := getArchiveRequest(request.repo, request.commit, request.archiveType); rExisting != nil {
-		return rExisting
-	}
-	if request.archiveComplete {
-		return request
-	}
-
-	request.cchan = make(chan struct{})
-	archiveInProgress = append(archiveInProgress, request)
-	go func() {
-		// Wait to start, if we have the Cond for it.  This is currently only
-		// useful for testing, so that the start and release of queued entries
-		// can be controlled to examine the queue.
-		if archiveQueueStartCond != nil {
-			archiveQueueMutex.Lock()
-			archiveQueueStartCond.Wait()
-			archiveQueueMutex.Unlock()
-		}
-
-		// Drop the mutex while we process the request.  This may take a long
-		// time, and it's not necessary now that we've added the reequest to
-		// archiveInProgress.
-		doArchive(request)
-
-		if archiveQueueReleaseCond != nil {
-			archiveQueueMutex.Lock()
-			archiveQueueReleaseCond.Wait()
-			archiveQueueMutex.Unlock()
-		}
-
-		// Purge this request from the list.  To do so, we'll just take the
-		// index at which we ended up at and swap the final element into that
-		// position, then chop off the now-redundant final element.  The slice
-		// may have change in between these two segments and we may have moved,
-		// so we search for it here.  We could perhaps avoid this search
-		// entirely if len(archiveInProgress) == 1, but we should verify
-		// correctness.
-		archiveMutex.Lock()
-		defer archiveMutex.Unlock()
-
-		idx := -1
-		for _idx, req := range archiveInProgress {
-			if req == request {
-				idx = _idx
-				break
-			}
-		}
-		if idx == -1 {
-			log.Error("ArchiveRepository: Failed to find request for removal.")
-			return
-		}
-		archiveInProgress = append(archiveInProgress[:idx], archiveInProgress[idx+1:]...)
-	}()
-
-	return request
-}
+func ArchiveRepository(request *ArchiveRequest) (*models.RepoArchiver, error) {
+	return doArchive(request)
+}
+
+var archiverQueue queue.UniqueQueue
+
+// Init initlize archive
+func Init() error {
+	handler := func(data ...queue.Data) {
+		for _, datum := range data {
+			archiveReq, ok := datum.(*ArchiveRequest)
+			if !ok {
+				log.Error("Unable to process provided datum: %v - not possible to cast to IndexerData", datum)
+				continue
+			}
+			log.Trace("ArchiverData Process: %#v", archiveReq)
+			if _, err := doArchive(archiveReq); err != nil {
+				log.Error("Archive %v faild: %v", datum, err)
+			}
+		}
+	}
+
+	archiverQueue = queue.CreateUniqueQueue("repo-archive", handler, new(ArchiveRequest))
+	if archiverQueue == nil {
+		return errors.New("unable to create codes indexer queue")
+	}
+
+	go graceful.GetManager().RunWithShutdownFns(archiverQueue.Run)
+
+	return nil
+}
+
+// StartArchive push the archive request to the queue
+func StartArchive(request *ArchiveRequest) error {
+	has, err := archiverQueue.Has(request)
+	if err != nil {
+		return err
+	}
+	if has {
+		return nil
+	}
+	return archiverQueue.Push(request)
+}
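Putting the new service API together: a caller resolves a URI into a concrete commit and format with NewRequest, then enqueues it. A minimal sketch, mirroring what the web handlers in routers/web/repo/repo.go above do (the ref name is hypothetical; repo and gitRepo are assumed to be already loaded):

	aReq, err := archiver_service.NewRequest(repo.ID, gitRepo, "master.tar.gz")
	if err != nil {
		return err
	}
	// The unique queue deduplicates identical pending requests, so repeated
	// downloads of the same ref do not enqueue duplicate archive work.
	if err := archiver_service.StartArchive(aReq); err != nil {
		return err
	}

Callers then poll GetRepoArchiver until the row reaches RepoArchiverReady, rather than blocking on a per-request channel as before.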

services/archiver/archiver_test.go

@@ -6,108 +6,75 @@
 package archiver
 
 import (
 	"path/filepath"
-	"sync"
 	"testing"
 	"time"
 
 	"code.gitea.io/gitea/models"
 	"code.gitea.io/gitea/modules/test"
-	"code.gitea.io/gitea/modules/util"
 
 	"github.com/stretchr/testify/assert"
 )
 
-var queueMutex sync.Mutex
-
 func TestMain(m *testing.M) {
 	models.MainTest(m, filepath.Join("..", ".."))
 }
 
 func waitForCount(t *testing.T, num int) {
-	var numQueued int
-
-	// Wait for up to 10 seconds for the queue to be impacted.
-	timeout := time.Now().Add(10 * time.Second)
-	for {
-		numQueued = len(archiveInProgress)
-		if numQueued == num || time.Now().After(timeout) {
-			break
-		}
-	}
-
-	assert.Len(t, archiveInProgress, num)
-}
-
-func releaseOneEntry(t *testing.T, inFlight []*ArchiveRequest) {
-	var nowQueued, numQueued int
-
-	numQueued = len(archiveInProgress)
-
-	// Release one, then wait up to 10 seconds for it to complete.
-	queueMutex.Lock()
-	archiveQueueReleaseCond.Signal()
-	queueMutex.Unlock()
-	timeout := time.Now().Add(10 * time.Second)
-	for {
-		nowQueued = len(archiveInProgress)
-		if nowQueued != numQueued || time.Now().After(timeout) {
-			break
-		}
-	}
-
-	// Make sure we didn't just timeout.
-	assert.NotEqual(t, numQueued, nowQueued)
-
-	// Also make sure that we released only one.
-	assert.Equal(t, numQueued-1, nowQueued)
 }
 
 func TestArchive_Basic(t *testing.T) {
 	assert.NoError(t, models.PrepareTestDatabase())
 
-	archiveQueueMutex = &queueMutex
-	archiveQueueStartCond = sync.NewCond(&queueMutex)
-	archiveQueueReleaseCond = sync.NewCond(&queueMutex)
-	defer func() {
-		archiveQueueMutex = nil
-		archiveQueueStartCond = nil
-		archiveQueueReleaseCond = nil
-	}()
-
 	ctx := test.MockContext(t, "user27/repo49")
 	firstCommit, secondCommit := "51f84af23134", "aacbdfe9e1c4"
 
-	bogusReq := DeriveRequestFrom(ctx, firstCommit+".zip")
-	assert.Nil(t, bogusReq)
-
 	test.LoadRepo(t, ctx, 49)
-	bogusReq = DeriveRequestFrom(ctx, firstCommit+".zip")
-	assert.Nil(t, bogusReq)
-
 	test.LoadGitRepo(t, ctx)
 	defer ctx.Repo.GitRepo.Close()
 
+	bogusReq, err := NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, firstCommit+".zip")
+	assert.NoError(t, err)
+	assert.NotNil(t, bogusReq)
+	assert.EqualValues(t, firstCommit+".zip", bogusReq.GetArchiveName())
+
 	// Check a series of bogus requests.
 	// Step 1, valid commit with a bad extension.
-	bogusReq = DeriveRequestFrom(ctx, firstCommit+".dilbert")
+	bogusReq, err = NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, firstCommit+".dilbert")
+	assert.Error(t, err)
 	assert.Nil(t, bogusReq)
 
 	// Step 2, missing commit.
-	bogusReq = DeriveRequestFrom(ctx, "dbffff.zip")
+	bogusReq, err = NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, "dbffff.zip")
+	assert.Error(t, err)
 	assert.Nil(t, bogusReq)
 
 	// Step 3, doesn't look like branch/tag/commit.
-	bogusReq = DeriveRequestFrom(ctx, "db.zip")
+	bogusReq, err = NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, "db.zip")
+	assert.Error(t, err)
 	assert.Nil(t, bogusReq)
 
+	bogusReq, err = NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, "master.zip")
+	assert.NoError(t, err)
+	assert.NotNil(t, bogusReq)
+	assert.EqualValues(t, "master.zip", bogusReq.GetArchiveName())
+
+	bogusReq, err = NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, "test/archive.zip")
+	assert.NoError(t, err)
+	assert.NotNil(t, bogusReq)
+	assert.EqualValues(t, "test-archive.zip", bogusReq.GetArchiveName())
+
 	// Now two valid requests, firstCommit with valid extensions.
-	zipReq := DeriveRequestFrom(ctx, firstCommit+".zip")
+	zipReq, err := NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, firstCommit+".zip")
+	assert.NoError(t, err)
 	assert.NotNil(t, zipReq)
 
-	tgzReq := DeriveRequestFrom(ctx, firstCommit+".tar.gz")
+	tgzReq, err := NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, firstCommit+".tar.gz")
+	assert.NoError(t, err)
 	assert.NotNil(t, tgzReq)
 
-	secondReq := DeriveRequestFrom(ctx, secondCommit+".zip")
+	secondReq, err := NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, secondCommit+".zip")
+	assert.NoError(t, err)
 	assert.NotNil(t, secondReq)
 
 	inFlight := make([]*ArchiveRequest, 3)
@@ -128,41 +95,9 @@ func TestArchive_Basic(t *testing.T) {
 	// Sleep two seconds to make sure the queue doesn't change.
 	time.Sleep(2 * time.Second)
-	assert.Len(t, archiveInProgress, 3)
-
-	// Release them all, they'll then stall at the archiveQueueReleaseCond while
-	// we examine the queue state.
-	queueMutex.Lock()
-	archiveQueueStartCond.Broadcast()
-	queueMutex.Unlock()
-
-	// Iterate through all of the in-flight requests and wait for their
-	// completion.
-	for _, req := range inFlight {
-		req.WaitForCompletion(ctx)
-	}
-
-	for _, req := range inFlight {
-		assert.True(t, req.IsComplete())
-		exist, err := util.IsExist(req.GetArchivePath())
-		assert.NoError(t, err)
-		assert.True(t, exist)
-	}
-
-	arbitraryReq := inFlight[0]
-	// Reopen the channel so we don't double-close, mark it incomplete.  We're
-	// going to run it back through the archiver, and it should get marked
-	// complete again.
-	arbitraryReq.cchan = make(chan struct{})
-	arbitraryReq.archiveComplete = false
-	doArchive(arbitraryReq)
-	assert.True(t, arbitraryReq.IsComplete())
-
-	// Queues should not have drained yet, because we haven't released them.
-	// Do so now.
-	assert.Len(t, archiveInProgress, 3)
 
-	zipReq2 := DeriveRequestFrom(ctx, firstCommit+".zip")
+	zipReq2, err := NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, firstCommit+".zip")
+	assert.NoError(t, err)
 	// This zipReq should match what's sitting in the queue, as we haven't
 	// let it release yet.  From the consumer's point of view, this looks like
 	// a long-running archive task.
@@ -173,46 +108,22 @@ func TestArchive_Basic(t *testing.T) {
 	// predecessor has cleared out of the queue.
 	ArchiveRepository(zipReq2)
 
-	// Make sure the queue hasn't grown any.
-	assert.Len(t, archiveInProgress, 3)
-
-	// Make sure the queue drains properly
-	releaseOneEntry(t, inFlight)
-	assert.Len(t, archiveInProgress, 2)
-	releaseOneEntry(t, inFlight)
-	assert.Len(t, archiveInProgress, 1)
-	releaseOneEntry(t, inFlight)
-	assert.Empty(t, archiveInProgress)
-
 	// Now we'll submit a request and TimedWaitForCompletion twice, before and
 	// after we release it.  We should trigger both the timeout and non-timeout
 	// cases.
-	var completed, timedout bool
-	timedReq := DeriveRequestFrom(ctx, secondCommit+".tar.gz")
+	timedReq, err := NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, secondCommit+".tar.gz")
+	assert.NoError(t, err)
 	assert.NotNil(t, timedReq)
 	ArchiveRepository(timedReq)
 
-	// Guaranteed to timeout; we haven't signalled the request to start..
-	completed, timedout = timedReq.TimedWaitForCompletion(ctx, 2*time.Second)
-	assert.False(t, completed)
-	assert.True(t, timedout)
-
-	queueMutex.Lock()
-	archiveQueueStartCond.Broadcast()
-	queueMutex.Unlock()
-
-	// Shouldn't timeout, we've now signalled it and it's a small request.
-	completed, timedout = timedReq.TimedWaitForCompletion(ctx, 15*time.Second)
-	assert.True(t, completed)
-	assert.False(t, timedout)
-
-	zipReq2 = DeriveRequestFrom(ctx, firstCommit+".zip")
+	zipReq2, err = NewRequest(ctx.Repo.Repository.ID, ctx.Repo.GitRepo, firstCommit+".zip")
+	assert.NoError(t, err)
 	// Now, we're guaranteed to have released the original zipReq from the queue.
 	// Ensure that we don't get handed back the released entry somehow, but they
 	// should remain functionally equivalent in all fields.  The exception here
 	// is zipReq.cchan, which will be non-nil because it's a completed request.
 	// It's fine to go ahead and set it to nil now.
 
-	zipReq.cchan = nil
 	assert.Equal(t, zipReq, zipReq2)
 	assert.False(t, zipReq == zipReq2)
