
UrlEncode URLs to be crawled

description

URLs found in an HTML document that contain spaces or non-ASCII characters such as "å ä ö" fail the Uri.IsWellFormedUriString check and are therefore not added to a crawl step.
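
The behaviour described above can be reproduced with a small console sketch. The class name, the helper IsAccepted, and the example.com URLs are made up for illustration; only Uri.IsWellFormedUriString and Uri.EscapeUriString are real .NET calls, and how NCrawler itself invokes the check is an assumption.

```csharp
using System;

public static class WellFormedCheckDemo
{
    // Hypothetical helper: mirrors the kind of check the description refers to.
    public static bool IsAccepted(string url)
    {
        return Uri.IsWellFormedUriString(url, UriKind.Absolute);
    }

    public static void Main()
    {
        // Illustrative URLs, not taken from the issue.
        string withSpace = "http://example.com/my page.html";
        string withUmlaut = "http://example.com/blåbär.html";

        // Raw strings, as they might appear in an href attribute.
        Console.WriteLine(IsAccepted(withSpace));
        Console.WriteLine(IsAccepted(withUmlaut));

        // The same strings after escaping.
        Console.WriteLine(IsAccepted(Uri.EscapeUriString(withSpace)));
        Console.WriteLine(IsAccepted(Uri.EscapeUriString(withUmlaut)));
    }
}
```

The URL with the raw space should be rejected and its escaped form accepted. The result for the raw "å ä ö" case may depend on the runtime's IRI handling, so it is worth running on the target framework. Note that Uri.EscapeUriString is flagged as obsolete on recent .NET versions.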

comments

ivanlewis wrote Aug 24, 2012 at 9:15 AM

Please add the line below

url = Uri.EscapeUriString(url).ToLower();

to the 'public static string NormalizeUri(this string url, string baseUrl)' method
in the 'NCrawler.Extensions' class.
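
A rough sketch of where the suggested line could sit. The real body of NormalizeUri is not shown in this issue, so everything except the signature and the escaped line is an assumption: the class name is hypothetical, and resolving against baseUrl is only a guess at what the method does.

```csharp
using System;

// Hypothetical stand-in for NCrawler.Extensions; not the library's actual code.
public static class NormalizeUriSketch
{
    public static string NormalizeUri(this string url, string baseUrl)
    {
        // Line proposed in the comment above: escape, then lower-case.
        url = Uri.EscapeUriString(url).ToLower();

        // Assumed remainder: resolve the (possibly relative) URL against the base.
        Uri resolved;
        if (Uri.TryCreate(new Uri(baseUrl), url, out resolved))
        {
            return resolved.AbsoluteUri;
        }

        // Fall back to the escaped string if it cannot be resolved.
        return url;
    }

    public static void Main()
    {
        Console.WriteLine("My Page.html".NormalizeUri("http://example.com/dir/"));
    }
}
```

One thing to weigh before adopting the line as written: ToLower() also lower-cases the path and query, which may change the meaning of a URL on servers that treat paths as case-sensitive.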

wrote Feb 21, 2013 at 11:52 PM