Backport #20869
Some Migration Downloaders provide re-writing of CloneURLs that may point to
unallowed urls. Recheck after the CloneURL is rewritten.
Signed-off-by: Andrew Thornton <art27@cantab.net>
Signed-off-by: Andrew Thornton <art27@cantab.net>
Storing the foreign identifier of an imported issue in the database is a prerequisite to implement idempotent migrations or mirror for issues. It is a baby step towards mirroring that introduces a new table.
At the moment when an issue is created by the Gitea uploader, it fails if the issue already exists. The Gitea uploader could be modified so that, instead of failing, it looks up the database to find an existing issue. And if it does it would update the issue instead of creating a new one. However this is not currently possible because an information is missing from the database: the foreign identifier that uniquely represents the issue being migrated is not persisted. With this change, the foreign identifier is stored in the database and the Gitea uploader will then be able to run a query to figure out if a given issue being imported already exists.
The implementation of mirroring for issues, pull requests, releases, etc. can be done in three steps:
1. Store an identifier for the element being mirrored (issue, pull request...) in the database (this is the purpose of these changes)
2. Modify the Gitea uploader to be able to update an existing repository with all it contains (issues, pull request...) instead of failing if it exists
3. Optimize the Gitea uploader to speed up the updates, when possible.
The second step creates code that does not yet exist to enable idempotent migrations with the Gitea uploader. When a migration is done for the first time, the behavior is not changed. But when a migration is done for a repository that already exists, this new code is used to update it.
The third step can use the code created in the second step to optimize and speed up migrations. For instance, when a migration is resumed, an issue that has an update time that is not more recent can be skipped and only newly created issues or updated ones will be updated. Another example of optimization could be that a webhook notifies Gitea when an issue is updated. The code triggered by the webhook would download only this issue and call the code created in the second step to update the issue, as if it was in the process of an idempotent migration.
The ForeignReferences table is added to contain local and foreign ID pairs relative to a given repository. It can later be used for pull requests and other artifacts that can be mirrored. Although the foreign id could be added as a single field in issues or pull requests, it would need to be added to all tables that represent something that can be mirrored. Creating a new table makes for a simpler and more generic design. The drawback is that it requires an extra lookup to obtain the information. However, this extra information is only required during migration or mirroring and does not impact the way Gitea currently works.
The foreign identifier of an issue or pull request is similar to the identifier of an external user, which is stored in reactions, issues, etc. as OriginalPosterID and so on. The representation of a user is however different and the ability of users to link their account to an external user at a later time is also a logic that is different from what is involved in mirroring or migrations. For these reasons, despite some commonalities, it is unclear at this time how the two tables (foreign reference and external user) could be merged together.
The ForeignID field is extracted from the issue migration context so that it can be dumped in files with dump-repo and later restored via restore-repo.
The GetAllComments downloader method is introduced to simplify the implementation and not overload the Context for the purpose of pagination. It also clarifies in which context the comments are paginated and in which context they are not.
The Context interface is no longer useful for the purpose of retrieving the LocalID and ForeignID since they are now both available from the PullRequest and Issue struct. The Reviewable and Commentable interfaces replace and serve the same purpose.
The Context data member of PullRequest and Issue becomes a DownloaderContext to clarify that its purpose is not to support in memory operations while the current downloader is acting but is not otherwise persisted. It is, for instance, used by the GitLab downloader to store the IsMergeRequest boolean and sort out issues.
---
[source](https://lab.forgefriends.org/forgefriends/forgefriends/-/merge_requests/36)
Signed-off-by: Loïc Dachary <loic@dachary.org>
Co-authored-by: Loïc Dachary <loic@dachary.org>
* Some refactors related repository model
* Move more methods out of repository
* Move repository into models/repo
* Fix test
* Fix test
* some improvements
* Remove unnecessary function
Use hostmacher to replace matchlist.
And we introduce a better DialContext to do a full host/IP check, otherwise the attackers can still bypass the allow/block list by a 302 redirection.
* Make the github migration less rate limit waiting to get comment per page from repository but not per issue
* Fix lint
* adjust Downloader interface
* Fix missed reviews
* Fix test
* Remove unused struct
* Add migrating message
Signed-off-by: Andrew Thornton <art27@cantab.net>
* simplify messenger
Signed-off-by: Andrew Thornton <art27@cantab.net>
* make messenger an interface
Signed-off-by: Andrew Thornton <art27@cantab.net>
* rename
Signed-off-by: Andrew Thornton <art27@cantab.net>
* prepare for merge
Signed-off-by: Andrew Thornton <art27@cantab.net>
* as per tech
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: 6543 <6543@obermui.de>
* Implemented LFS client.
* Implemented scanning for pointer files.
* Implemented downloading of lfs files.
* Moved model-dependent code into services.
* Removed models dependency. Added TryReadPointerFromBuffer.
* Migrated code from service to module.
* Centralised storage creation.
* Removed dependency from models.
* Moved ContentStore into modules.
* Share structs between server and client.
* Moved method to services.
* Implemented lfs download on clone.
* Implemented LFS sync on clone and mirror update.
* Added form fields.
* Updated templates.
* Fixed condition.
* Use alternate endpoint.
* Added missing methods.
* Fixed typo and make linter happy.
* Detached pointer parser from gogit dependency.
* Fixed TestGetLFSRange test.
* Added context to support cancellation.
* Use ReadFull to probably read more data.
* Removed duplicated code from models.
* Moved scan implementation into pointer_scanner_nogogit.
* Changed method name.
* Added comments.
* Added more/specific log/error messages.
* Embedded lfs.Pointer into models.LFSMetaObject.
* Moved code from models to module.
* Moved code from models to module.
* Moved code from models to module.
* Reduced pointer usage.
* Embedded type.
* Use promoted fields.
* Fixed unexpected eof.
* Added unit tests.
* Implemented migration of local file paths.
* Show an error on invalid LFS endpoints.
* Hide settings if not used.
* Added LFS info to mirror struct.
* Fixed comment.
* Check LFS endpoint.
* Manage LFS settings from mirror page.
* Fixed selector.
* Adjusted selector.
* Added more tests.
* Added local filesystem migration test.
* Fixed typo.
* Reset settings.
* Added special windows path handling.
* Added unit test for HTTPClient.
* Added unit test for BasicTransferAdapter.
* Moved into util package.
* Test if LFS endpoint is allowed.
* Added support for git://
* Just use a static placeholder as the displayed url may be invalid.
* Reverted to original code.
* Added "Advanced Settings".
* Updated wording.
* Added discovery info link.
* Implemented suggestion.
* Fixed missing format parameter.
* Added Pointer.IsValid().
* Always remove model on error.
* Added suggestions.
* Use channel instead of array.
* Update routers/repo/migrate.go
* fmt
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: zeripath <art27@cantab.net>
* Ensure validation occurs on clone addresses too
Fix#14984
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix lint
Signed-off-by: Andrew Thornton <art27@cantab.net>
* fix test
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Fix api tests
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
PR #13610 unfortunately disabled importing repositories from local paths.
This PR restores this functionality.
Fix#14700
Signed-off-by: Andrew Thornton <art27@cantab.net>
* add black list and white list support for migrating repositories
* fix fmt
* fix lint
* fix vendor
* fix modules.txt
* clean diff
* specify log message
* use blocklist/allowlist
* allways use lowercase to match url
* Apply allow/block
* Settings: use existing "migrations" section
* convert domains lower case
* dont store unused value
* Block private addresses for migration by default
* fix lint
* use proposed-upstream func to detect private IP addr
* a nit
* add own error for blocked migration, add tests, imprufe api
* fix test
* fix-if-localhost-is-ipv4
* rename error & error message
* rename setting options
* Apply suggestions from code review
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: zeripath <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* Migrating reactions is just not that important
A failure during migrating reactions should not cause failure of
migration.
Signed-off-by: Andrew Thornton <art27@cantab.net>
* When checking issue reactions check the correct permission
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* first draft
* update gitea sdk to 9e280adb4da
* adapt feat of updated sdk
* releases now works
* break the Reactions loop
* use convertGiteaLabel
* fix endless loop because paggination is not supported there !!!
* rename gitea local uploader files
* pagination can bite you in the ass
* Version Checks
* lint
* docs
* rename gitea sdk import to miss future conficts
* go-swagger: dont scan the sdk structs
* make sure gitea can shutdown gracefully
* make GetPullRequests and GetIssues similar
* rm useles
* Add Test: started ...
* ... add tests ...
* Add tests and Fixing things
* Workaround missing SHA
* Adapt: Ensure that all migration requests are cancellable
(714ab71ddc)
* LINT: fix misspells in test set
* adapt ListMergeRequestAwardEmoji
* update sdk
* Return error when creating giteadownloader failed
* update sdk
* adapt new sdk
* adopt new features
* check version before err
* adapt: 'migrate service type switch page'
* optimize
* Fix DefaultBranch
* impruve
* handle subPath
* fix test
* Fix ReviewCommentPosition
* test GetReviews
* add DefaultBranch int test set
* rm unused
* Update SDK to v0.13.0
* addopt sdk changes
* found better link
* format template
* Update Docs
* Update Gitea SDK (v0.13.1)
* Ensure that all migration requests are cancellable
Signed-off-by: Andrew Thornton <art27@cantab.net>
* Use WithContext as RequestWithContext is go 1.14
Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: techknowlogick <techknowlogick@gitea.io>
* fix typo
* Migrate reviews when migrating repository from github
* fix lint
* Added test and migration when external user login
* fix test
* fix commented state
* Some improvements
* fix bug when get pull request and ref original author on code comments
* Fix migrated line; Added comment for review
* Don't load all pull requests attributes
* Fix typo
* wrong change copy head
* fix tests
* fix reactions
* Fix test
* fix fmt
* fix review comment reactions
CloneAddr will contain username and password credentials and they will
get stored in system notices about failed migrations (and logs if trace
is set). Replace with OriginalURL that doesn't have those.
* Sleep longer if request speed is over github limitation
* improve code
* remove unused code
* fix lint
* Use github's rate limit remain value to determine how long to sleep
* Save reset time when finished github api request
* fix bug
* fix lint
* Add context.Context for sleep
* fix test
* improve code
* fix bug and lint
* fix import order
* Add retry for migration http/https requests
* give the more suitable name for retry configuraion items
* fix docs and lint
* Only use retryDownloader when setting > 1
In investigating #7947 it has become clear that the storage component of go-git repositories needs closing.
This PR adds this Close function and adds the Close functions as necessary.
In TransferOwnership the ctx.Repo.GitRepo is closed if it is open to help prevent the risk of multiple open files.
Fixes#7947
* Store original author info for migrated issues and comments
Keep original author name for displaying in Gitea interface and also
store original author user ID for potential future use in linking
accounts from old location.
* Add original_url for repo
Store the original URL for a migrated repo
Clean up migrations/tests
* fix migration
* fix golangci-lint
* make 'make revive' happy also
* Modify templates to use OriginalAuthor if set
Use the original author name in templates if it is set rather than the
user who migrated/currently owns the issues
* formatting fixes
* make generate-swagger
* Use default avatar for imported comments
* Remove no longer used IgnoreIssueAuthor option
* Add OriginalAuthorID to swagger also