* Use stricter boundaries for auto-link detection
Currently autolinks use \W for boundary detection, which inserts links
into many places they don't belong (paths, URLs, UUIDs, etc.).
This fixes that by replacing \W and only allowing these matches to touch
an open paren or bracket (matching what seems to be GitHub behavior) in
addition to whitespace and start of line. The ending boundary is
restricted in a similar way.
Fixes #6149
(and probably others)
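The difference between the two boundary styles can be sketched roughly as follows. This is an illustration only, not the patterns used in the codebase: loosePattern, strictPattern and findRef are hypothetical names, and the issue-reference syntax (#123) is assumed for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// loosePattern (hypothetical) mimics the old style: any non-word
// character may sit directly in front of the reference.
var loosePattern = regexp.MustCompile(`(?:\W|^)(#[0-9]+)\b`)

// strictPattern (hypothetical) only lets the reference touch whitespace,
// start of input, an open paren or an open bracket on the left, and
// whitespace, end of input, a close paren, a close bracket, or a
// sentence-ending period on the right.
var strictPattern = regexp.MustCompile(`(?:\s|^|\(|\[)(#[0-9]+)(?:\s|$|\)|\]|\.(?:\s|$))`)

// findRef returns the first captured reference, or "" if there is none.
func findRef(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	path := "see /path/#123"
	fmt.Printf("loose:  %q\n", findRef(loosePattern, path))
	fmt.Printf("strict: %q\n", findRef(strictPattern, path))
	fmt.Printf("strict: %q\n", findRef(strictPattern, "(#123)"))
}
```

With the loose pattern the reference inside the path is picked up; with the strict one it is not, while a reference wrapped in parentheses still matches.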
* Update test
Replace incorrect test with a value that is a valid username, based on:
"Username should contain only alphanumeric, dash ('-'), underscore ('_')
and dot ('.') characters."
* Also allow for period at the end
Matching GitHub behavior
* Fix email regex to work properly with specified boundaries
Create a specific capture group for email address and then use
FindStringSubmatchIndex to allow for non-matching patterns as
boundaries.
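A minimal sketch of the capture-group approach, assuming a simplified email pattern: emailPattern and linkEmail are hypothetical names, and the real expression may differ. The boundaries are part of the match but sit outside the capture group, so FindStringSubmatchIndex can report the offsets of the address alone.

```go
package main

import (
	"fmt"
	"regexp"
)

// emailPattern (hypothetical, simplified) wraps the address in a capture
// group; the surrounding non-capturing groups are the allowed boundaries.
var emailPattern = regexp.MustCompile(
	`(?:\s|^|\(|\[)([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})(?:\s|$|\)|\]|\.(?:\s|$))`)

// linkEmail replaces the first address in s with a mailto link and
// leaves the boundary characters around it untouched.
func linkEmail(s string) string {
	loc := emailPattern.FindStringSubmatchIndex(s)
	if loc == nil {
		return s
	}
	// loc[0:2] spans the whole match including boundaries;
	// loc[2:4] spans only the first capture group.
	start, end := loc[2], loc[3]
	addr := s[start:end]
	return s[:start] + `<a href="mailto:` + addr + `">` + addr + `</a>` + s[end:]
}

func main() {
	fmt.Println(linkEmail("mail (user@example.com) now"))
	fmt.Println(linkEmail("path/user@example.com/x"))
}
```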
* Add Tests
Add tests for new behavior -- including tests for email addresses which
were absent before.
This improves the SHA1 link detection to not pick up extraneous
non-whitespace characters at the end of the URL. The '.' is a special
case handled in code itself because of missing regexp lookahead
support.
Regex test cases: https://regex101.com/r/xUMlqh/3
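Because Go's regexp package has no lookahead, a trailing '.' cannot be excluded by the pattern alone when the tail of the URL is matched greedily. One way to handle it in code is sketched below; commitURLPattern and findCommitURL are hypothetical names and the URL layout is an assumption for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// commitURLPattern (hypothetical) matches a commit URL with an optional
// path suffix and an optional fragment.
var commitURLPattern = regexp.MustCompile(
	`https?://\S+/commit/[0-9a-f]{7,40}(?:/[^#\s]*)?(?:#\S+)?`)

// findCommitURL returns the first commit URL in s. A period at the very
// end of the match is treated as sentence punctuation and dropped.
func findCommitURL(s string) string {
	loc := commitURLPattern.FindStringIndex(s)
	if loc == nil {
		return ""
	}
	start, end := loc[0], loc[1]
	if s[end-1] == '.' {
		end--
	}
	return s[start:end]
}

func main() {
	s := "see https://example.com/org/repo/commit/0123456789abcdef0123456789abcdef01234567#diff-2."
	fmt.Println(findCommitURL(s))
}
```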
* Migrate to go modules
* make vendor
* Update mvdan.cc/xurls
* make vendor
* Update code.gitea.io/git
* make fmt-check
* Update github.com/go-sql-driver/mysql
* make vendor
* Replace linkRegex with xurls library
Rather than maintaining a complicated regex to match URLs for
autolinking, gitea can use this existing go library that takes care of
the matching with very little code change to gitea itself. After
spending a while trying to find the perfect regex for all cases this library
still works better as it is more flexible than a single regex ever will be.
This will also fix the following issues: #5844 #3095 #3381
This passes all our current tests and I've added new ones mentioned in
those issues as well.
* Use xurls.StrictMatchingScheme instead of xurls.Strict
This is much faster and we only care about https? links to preserve
existing behavior.
The visitLinksForShortLinks feature would look inside of an <a> tag and
run shortLinkProcessorFull on any text, which attempts to create links
out of potential 'short links' like [[test]], [[link|example]], etc.
This makes no sense because you can't have nested links within an <a>
tag. Specifically, the html5 standard says <a> tags can't include
interactive content if they contain the href attribute:
http://w3c.github.io/html/single-page.html#the-a-element
And also defines an <a> element with a href attribute as interactive:
http://w3c.github.io/html/single-page.html#interactive-content
Therefore you can't really put a link inside of another link. In
practice none of this works anyway: browsers won't render it, it
would probably be broken if they tried, and it is causing a bug
(#4946). No current tests rely on this behavior either.
This removes the feature and also explicitly excludes the
current visitNodeForShortLinks from looking in <a> tags.
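The exclusion can be pictured as a tree walk that stops descending when it reaches an <a> element. The sketch below uses a hypothetical, minimal node type instead of a real HTML parser, and collectLinkableText is an illustrative name, not a function from the codebase.

```go
package main

import "fmt"

// node is a hypothetical, minimal stand-in for an HTML parse tree node.
type node struct {
	tag      string  // element name; empty for text nodes
	text     string  // content, used only by text nodes
	children []*node // child nodes, used only by elements
}

// collectLinkableText gathers the text nodes a short-link processor may
// rewrite. Everything under an <a> element is skipped, since a link
// cannot be nested inside another link.
func collectLinkableText(n *node, out []string) []string {
	if n.tag == "a" {
		return out
	}
	if n.tag == "" {
		return append(out, n.text)
	}
	for _, c := range n.children {
		out = collectLinkableText(c, out)
	}
	return out
}

func main() {
	tree := &node{tag: "p", children: []*node{
		{text: "see [[one]] "},
		{tag: "a", children: []*node{{text: "[[two]]"}}},
		{text: " end"},
	}}
	fmt.Printf("%q\n", collectLinkableText(tree, nil))
}
```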
Modify the current linkRegex to require http|https, which appears to be
the intended behavior based on the comments. Right now, it also matches
anything starting with www. Also add tests for linkRegex.
* Get rid of autolink
* autolink in markdown
* Replace email addresses with mailto links
* better handling of links
* Remove autolink.js from footer
* Refactor entire html.go
* fix some bugs
* Make tests green, move what we can to html_internal_test, various other changes to processor logic
* Make markdown tests work again
This is just a description to allow me to force push in order to restart
the drone build.
* Fix failing markdown tests in routers/api/v1/misc
* Add license headers, log errors, future-proof <body>
* fix formatting
* restructure markup & markdown to prepare for multiple markup languages support
* adjust some functions between markdown and markup
* fix tests
* improve the comments
This changes the regex to look for a hash of 7 to 40 characters,
to match the use of abbreviated hash lookups in both git and GitHub.
The restriction of not being a pure number is also removed because
1234567 is now considered a valid abbreviated hash, as is deadbeef.
A note has been added to the top of the code to state that the
literal regex match is fine, but no extra validation is currently
performed so some false positives are expected.
A future change could ensure that the hash exists in the repository
before rendering it as a link, although this might incur a slight
performance penalty.
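As a rough illustration of the 7 to 40 character rule and the false positives it allows, assuming a plain word-boundary pattern (abbreviatedHashPattern and findHashes are hypothetical names; the actual regex may differ):

```go
package main

import (
	"fmt"
	"regexp"
)

// abbreviatedHashPattern (hypothetical) accepts any run of 7 to 40
// lowercase hex digits between word boundaries. No check is made that
// the hash exists in a repository.
var abbreviatedHashPattern = regexp.MustCompile(`\b[0-9a-f]{7,40}\b`)

// findHashes returns every candidate hash found in s.
func findHashes(s string) []string {
	return abbreviatedHashPattern.FindAllString(s, -1)
}

func main() {
	fmt.Printf("%q\n", findHashes("fix deadbeef and 1234567, not abc123"))
	// An ordinary word made only of hex letters is a false positive.
	fmt.Printf("%q\n", findHashes("the defaced wall"))
}
```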
Reverts part of commit 4a46613 and fixes #2053.
* Fix commit sha1 URL rendering in markdown
* Add unit test for commit sha1 markdown rendering when sha1 has space before it
* Change to better variable name
* Markdown rendering overhaul
Cleaned up and squashed commits into single one.
Signed-off-by: Andrew Boyarshin <boyarshinand@gmail.com>
* Fix markdown API, add markdown module and API tests, improve code coverage
Signed-off-by: Andrew Boyarshin <boyarshinand@gmail.com>
Replace spaces with "%20" in "urlPrefix" before markdown processing.
The spaces were causing blackfriday (the markdown processor) to behave
strangely. This fixes #2545.
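The replacement itself is small. A minimal sketch, assuming the prefix is handled as a plain string (escapeSpaces is a hypothetical helper name):

```go
package main

import (
	"fmt"
	"strings"
)

// escapeSpaces (hypothetical) percent-encodes spaces in a URL prefix so
// that a markdown processor sees one unbroken URL.
func escapeSpaces(urlPrefix string) string {
	return strings.ReplaceAll(urlPrefix, " ", "%20")
}

func main() {
	fmt.Println(escapeSpaces("https://example.com/my org/my repo"))
}
```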