我需要一个字符串集的精确定义,这些字符串被认为是models.URLField的合法urls。
就我而言,精确的定义可以是一个正则表达式,也可以是一段验证url有效性的代码,或者任何其他形式的准确解释。
发布于 2011-04-24 23:51:45
通过查看源代码,似乎URLField是一个CharField,它将一个validators.URLValidator添加到字段的验证器列表中,这是从django.core导入的。
来自site-packages/django/core/validators.py
class URLValidator(RegexValidator):
regex = re.compile(
r'^https?://' # http:// or https://
r'(?:(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.)+[A-Z]{2,6}\.?|' #domain...
r'localhost|' #localhost...
r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})' # ...or ip
r'(?::\d+)?' # optional port
r'(?:/?|[/?]\S+)$', re.IGNORECASE)
def __init__(self, verify_exists=False, validator_user_agent=URL_VALIDATOR_USER_AGENT):
super(URLValidator, self).__init__()
self.verify_exists = verify_exists
self.user_agent = validator_user_agent
def __call__(self, value):
try:
super(URLValidator, self).__call__(value)
except ValidationError, e:
# Trivial case failed. Try for possible IDN domain
if value:
value = smart_unicode(value)
scheme, netloc, path, query, fragment = urlparse.urlsplit(value)
try:
netloc = netloc.encode('idna') # IDN -> ACE
except UnicodeError: # invalid domain part
raise e
url = urlparse.urlunsplit((scheme, netloc, path, query, fragment))
super(URLValidator, self).__call__(url)
else:
raise
else:
url = value
if self.verify_exists:
import urllib2
headers = {
"Accept": "text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5",
"Accept-Language": "en-us,en;q=0.5",
"Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.7",
"Connection": "close",
"User-Agent": self.user_agent,
}
try:
req = urllib2.Request(url, None, headers)
u = urllib2.urlopen(req)
except ValueError:
raise ValidationError(_(u'Enter a valid URL.'), code='invalid')
except: # urllib2.URLError, httplib.InvalidURL, etc.
raise ValidationError(_(u'This URL appears to be a broken link.'), code='invalid_link')因此,它尝试验证正则表达式,如果验证失败,它会使用Python的urlparse模块取消域名的IDN,然后重试。
https://stackoverflow.com/questions/5771472
复制相似问题