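I'd like some of the folks at Stack Overflow to take a look at my Python static blog application. I've been using it for several years. Recently I decided to clean it up and put it on GitHub. I'm hoping some smarter Python programmers can offer advice and wisdom to help me improve, optimize, and simplify the code.
The program is here: https://github.com/mshea/Pueblo
Some philosophy: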
#!/usr/local/bin/python
#
# Pueblo: Python Markdown Static Blogger
#
# 17 December 2011
#
# A single Python script to build a simple blog from a directory full of markdown files.
#
# This script requires the Markdown python implementation available at:
# http://pypi.python.org/pypi/Markdown/2.1.0
#
# This script requires markdown files using the following multimarkdown metadata as the first three lines
# of the processed .txt markdown files as follows:
#
# Title: the Title of your Document
# Author: Joe Blow
# Date: 15 December 2011
#
# The program will generate an index.html homepage file, an archive.html archive file,
# and an index.xml RSS file.
#
# Header and footer data can be edited in the variables throughout the program.
#
# This script expects the following additional files:
# style.css: The main site's stylesheet.
# iphone.css: The mobile version of the site's stylesheet.
# sidebar.html: A secondary set of data usually displayed as a sidebar.
#
# Instructions
# Install the Markdown python module.
# Configure this script by changing the configuration variables below.
# Put your static markdown .txt files in the configured directory
# Run the script either manually, with a regular cronjob, or as a CGI script.
# View the output at index.html
config = {
    "directory": ".", # No trailing slash.
    "site_url": "http://yoursite.net/", # Must have a trailing slash.
    "site_title": "Your Website",
    "site_description": "Your blog tagline.",
    "google_analytics_tag": "UA-111111-1",
    "author_name": "Your Name",
    "author_bio_link": "about.html",
    "amazon_tag": "mikesheanet-20",
    "twitter_tag": "twitterid",
    "author_email": "your@emailaddress.com",
    "header_image_url": "",
    "header_image_width": "",
    "header_image_height": "",
    "sidebar_on_article_pages": False,
    "minify_html": False,
}
nonentryfiles = []
# Main Program
import glob, re, rfc822, time, cgi, datetime, markdown
from time import gmtime, strftime, localtime, strptime
def rebuildsite ():
    textfiles = glob.glob(config["directory"]+"//*.txt")
    for nonfile in nonentryfiles:
        textfiles.remove(config["directory"]+"/"+nonfile)
    indexdata = []
    # Rip through the stack of .txt markdown files and build HTML pages from it.
    for eachfile in textfiles:
        eachfile = eachfile.replace(config["directory"]+"\\", "")
        content = open(eachfile).read()
        lines = re.split("\n", content)
        title = re.sub("(Title: )|( )", "", lines[0])
        title = cgi.escape(title)
        urltitle = title.replace("&", "%26")
        author = lines[1].replace("Author: ","")
        date = re.sub("( )|(\n)|(Date: )","",lines[2])
        numdate = strftime("%Y-%m-%d", strptime(date, "%d %B %Y"))
        content = markdown.markdown(re.sub("(Title:.*\n)|(Author:.*\n)|(Date:.*\n\n)| ", "", content))
        summary = re.sub("<[^<]+?>","", content)
        summary = summary.replace("\n", " ")[0:200]
        htmlfilenamefull = htmlfilename = eachfile.replace(".txt", ".html")
        htmlfilename = htmlfilename.replace(config["directory"]+"/", "")
        postname = htmlfilename.replace(".html", "")
        # Build the HTML file, add a bit of footer text.
        htmlcontent = [buildhtmlheader("article", title, date)]
        htmlcontent.append(content)
        htmlcontent.append(buildhtmlfooter("article", urltitle))
        htmlfile = open(htmlfilenamefull, "w")
        htmlfile.write(minify("".join(htmlcontent)))
        htmlfile.close()
        if numdate <= datetime.datetime.now().strftime("%Y-%m-%d"):
            indexdata.append([[numdate],[title],[summary],[htmlfilename],[content]])
    # The following section builds index.html, archive.html and index.xml.
    indexdata.sort()
    indexdata.reverse()
    indexbody=archivebody=rssbody=""
    count=0
    for indexrow in indexdata:
        dateobject = strptime(indexrow[0][0], "%Y-%m-%d")
        rssdate = strftime("%a, %d %b %Y 06:%M:%S +0000", dateobject)
        nicedate = strftime("%d %B %Y", dateobject)
        articleitem = '''
<h2><a href="%(article_link)s">%(article_title)s</a></h2>
<p>%(date)s - %(summary)s...</p>
''' % {
            'article_link': indexrow[3][0],
            'article_title': indexrow[1][0],
            'date': nicedate,
            'summary': indexrow[2][0],
        }
        rssitem = '''
<item>
<title>%(title)s</title>
<link>%(link)s</link>
<guid>%(link)s</guid>
<pubDate>%(pubdate)s</pubDate>
<description>%(description)s</description>
<content:encoded>
<![CDATA[%(cdata)s]]>
</content:encoded>
</item>
''' % {
            'title': indexrow[1][0],
            'link': config["site_url"]+indexrow[3][0],
            'pubdate': rssdate,
            'description': indexrow[2][0],
            'cdata': indexrow[4][0],
        }
        count = count + 1
        if count < 15:
            rssbody = rssbody + rssitem
        if count < 30:
            indexbody = indexbody+articleitem
        archivebody = archivebody + articleitem
    sidebardata = open(config["directory"]+"/sidebar.html").read()
    rssdatenow = rfc822.formatdate()
    indexdata = [buildhtmlheader("index", config["site_title"], "none")]
    indexdata.append(indexbody)
    indexdata.append("<h2><a href=\"archive.html\">View All %(article_count)s Articles</a></h2>\n</div>\n"
        % { 'article_count': str(count) })
    indexdata.append(buildhtmlfooter("index", ""))
    indexfile = open(config["directory"]+"/index.html", "w").write(minify("".join(indexdata)))
    archivedata = [buildhtmlheader("archive", config["site_title"]+" Article Archive", "none")]
    archivedata.append(archivebody)
    archivedata.append("\n</div>\n")
    archivedata.append(buildhtmlfooter("archive", ""))
    archivefile = open (config["directory"]+"/archive.html", "w").write(minify("".join(archivedata)))
    rsscontent = '''<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/\"
>
<channel>
<title>%(site_title)s</title>
<link>%(site_url)s</link>
<description>%(site_description)s</description>
<pubDate>%(rssdatenow)s</pubDate>
<language>en</language>
<atom:link href="%(site_url)sindex.xml" rel="self" type="application/rss+xml" />
%(rssbody)s
</channel>
</rss>
''' % {
        'site_url': config["site_url"],
        'site_title': config["site_title"],
        'site_description': config["site_description"],
        'rssdatenow': rssdatenow,
        'rssbody': rssbody,
    }
    rssfile = open(config["directory"]+"/index.xml", "w").write(minify(rsscontent))
# Subroutine to build out the page's HTML header
def buildhtmlheader(type, title, date):
    if config["header_image_url"] != "":
        headerimage = '''
<img class="headerimg" src="%(header_image_url)s" alt="%(site_title)s: %(site_description)s" height="%(header_image_height)s" width="%(header_image_width)s" />
''' % {
            'header_image_url': config["header_image_url"],
            'site_title': config["site_title"],
            'site_description': config["site_description"],
            'header_image_height': config["header_image_height"],
            'header_image_width': config["header_image_width"],
        }
    htmlheader = ['''
<!DOCTYPE html>
<html>
<head>
<title>%(title)s</title>
<link rel="stylesheet" type="text/css" media="screen and (min-width: 481px)" href="style.css">
<link rel="stylesheet" type="text/css" media="only screen and (max-width: 480px)" href="iphone.css">
<link rel="alternate" type="application/rss+xml" title="%(title)s" href="index.xml">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="viewport" content="user-scalable=no, width=device-width" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', '%(google_analytics_tag)s']);
_gaq.push(['_trackPageview']);
(function() { var ga = document.createElement('script');
ga.type = 'text/javascript';
ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(ga, s);
})();
</script>
</head>
<body>
''' % {
        'title': title,
        'google_analytics_tag': config["google_analytics_tag"],
    } ]
    # Tons of conditional checks lay ahead. Does it use a header image
    # and do you want the sidebar on article pages?
    if config["sidebar_on_article_pages"] != True and type == "article":
        htmlheader.append("\n<div class=\"article_container\">\n")
    else:
        htmlheader.append("\n<div class=\"container\">\n")
    if config["header_image_url"] != "" and type == "index":
        htmlheader.append(headerimage)
    elif config["header_image_url"] != "" and type != "index":
        htmlheader.append("<a href=\"/\">\n" + headerimage + "</a>\n")
    elif config["header_image_url"] == "" and type == "index":
        htmlheader.append('''
<div class="header">
<h1>%(site_title)s</h1>
<p>%(site_description)s</p>
</div>
''' % {
            'site_title': config["site_title"],
            'site_description': config["site_description"],
        } )
    elif config["header_image_url"] == "" and type != "index":
        htmlheader.append('''
<p class="return_link">
<a href="index.html">%(site_title)s</a>
</p>
''' % {
            'site_title': config["site_title"]
        } )
    if type == "index":
        htmlheader.append("\n<div class=\"article_list\">\n")
    elif type == "archive":
        htmlheader.append("\n<div class=\"article_list\">\n<h1>Article Archive</h1>\n")
    elif type == "article":
        htmlheader.append('''
<div class="article">
<h1>%(title)s</h1>
<p>by <a href="%(author_bio_link)s">%(author_name)s</a> on %(date)s</p>
''' % {
            'author_bio_link': config["author_bio_link"],
            'title': title,
            'author_name': config["author_name"],
            'date': date,
        } )
    return "".join(htmlheader)
# Subroutine to remove all line breaks to make for some packed fast HTML
def minify(content):
    if config["minify_html"]:
        content = re.sub("\n","",content)
    return content
# Subroutine to build out the footer.
def buildhtmlfooter (type, urltitle):
    footer_parts = []
    sidebardata = open(config["directory"]+"/sidebar.html").read()
    if type == "index" or type == "archive" or config["sidebar_on_article_pages"]:
        footer_parts.append(sidebardata)
    if type == "article":
        footer_parts.append(
'''
<p>Send feedback to <a href="mailto:%(email)s">%(email)s</a> or <a href="http://twitter.com/share?via=%(twitter_tag)s&text=%(urltitle)s">share on twitter</a>.</p>
''' % {
            'email': config['author_email'],
            'twitter_tag': config['twitter_tag'],
            'urltitle': urltitle,
        })
    footer_parts.append("\n</div>\n</body>\n</html>")
    return "".join(footer_parts)
# This program is designed to run as a CGI script so you can rebuild your site by hitting a URL.
print "Content-type: text/html\n\n"
rebuildsite()
print "<html><head><title>Site Rebuilt</title></head><body><h1>Site Rebuilt</h1></body></html>"
Posted on 2011-12-17 02:09:25
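Here are a few suggestions:
String comparisons should use == (and !=), not is (and is not). is may happen to work, but it compares identity (typically a memory address), whereas == compares values. See https://stackoverflow.com/a/2988117/331473 for details.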
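A minimal sketch of the difference, written in Python 3 syntax rather than the Python 2 of the script above. The variable names are illustrative and do not come from Pueblo:

```python
# Two strings with the same characters, one of them built at runtime.
built = "".join(["art", "icle"])
literal = "article"

# == compares the characters, so this is the check you want for page types.
values_match = built == literal

# "is" asks whether both names point at the very same object. Whether that
# holds for two equal strings is an implementation detail, so never rely on it.
same_object = built is literal

print(values_match)
print(same_object)
```

Only the == result is dependable here; the outcome of the is check can vary between interpreters and even between ways of constructing the string.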
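Your boolean config values (e.g. minify_html) should be actual booleans, True/False, rather than 1/0. Also, when you check them, drop the comparison. For example:

    if minify_html == 1:  # or minify_html == True
        ...               # (if you've converted these to booleans)

can be written as:

    if minify_html:
        ...

Using module-level variables for config is generally fine for a handful of settings, but once you have a whole catalog of them they become hard to keep track of. While reading your code, I asked myself several times, "Where did that variable come from?"

If you want to address this, you could put them in a dictionary, which "namespaces" them, so to speak. For example:

    config = {
        "site_url": "...",
        "site_name": "..."
    }

Then, in your code, the config bits are easier to spot:

    if config['minify_html']:
        ....

There is more cleanup that could be done, but these are just the first few things that came to mind. One more thing: I don't have time to work through it right now, but the long chain of if/elses in buildhtmlheader could probably be refactored to make things a little less redundant.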
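One possible shape for that refactor, as a sketch only: it is Python 3, and build_banner plus the three template constants are hypothetical names that do not exist in Pueblo. It assumes the four-way header-image chain can be reduced to two questions (is an image configured, and is this the index page), and it drops the leading blank line of the original templates.

```python
# Hypothetical helper, not part of Pueblo: returns the banner markup that
# buildhtmlheader currently picks with a four-way if/elif chain.
IMAGE_TEMPLATE = (
    '<img class="headerimg" src="%(header_image_url)s" '
    'alt="%(site_title)s: %(site_description)s" '
    'height="%(header_image_height)s" width="%(header_image_width)s" />\n'
)
TEXT_HEADER_TEMPLATE = (
    '<div class="header">\n<h1>%(site_title)s</h1>\n'
    '<p>%(site_description)s</p>\n</div>\n'
)
RETURN_LINK_TEMPLATE = (
    '<p class="return_link">\n<a href="index.html">%(site_title)s</a>\n</p>\n'
)


def build_banner(page_type, config):
    """Return banner HTML for a page type ("index", "archive" or "article")."""
    is_index = page_type == "index"
    if config["header_image_url"]:
        # The config dict can be passed straight to %; extra keys are ignored.
        image = IMAGE_TEMPLATE % config
        # Away from the index page, the image doubles as a link back home.
        return image if is_index else '<a href="/">\n' + image + '</a>\n'
    # No image configured: text header on the index, return link elsewhere.
    template = TEXT_HEADER_TEMPLATE if is_index else RETURN_LINK_TEMPLATE
    return template % config


sample_config = {
    "site_title": "Your Website",
    "site_description": "Your blog tagline.",
    "header_image_url": "",
    "header_image_width": "",
    "header_image_height": "",
}
print(build_banner("index", sample_config))
print(build_banner("article", sample_config))
```

Each condition is tested once, and the templates live in one place, so adding a page type or changing the markup touches fewer lines.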
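Posted on 2011-12-18 09:34:29
file. If you need this function later, it will take time to figure out why it isn't working.
The with keyword closes the file automatically, so it is a better fit for opening files:

    with open(filename, 'r') as f:
        content = f.read()

rebuildsite. To prevent this, use:

    if __name__ == '__main__':
        print "Content-type: text/html\n\n"
        rebuildsite()
        print "Site Rebuilt"

os.path.join is better for joining paths:

    os.path.join(directory, 'sidebar.html')

Follow the other PEP8 recommendations.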
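The three code suggestions above can be combined in one short sketch. This is Python 3, not the Python 2 of the original script; read_sidebar and main are hypothetical names, and main only stands in for rebuildsite() by working in a throwaway directory.

```python
import os
import tempfile


def read_sidebar(directory):
    """Read sidebar.html from the given directory and return its text."""
    # os.path.join inserts the right separator for the current platform.
    path = os.path.join(directory, "sidebar.html")
    # The with block closes the file even if read() raises.
    with open(path, "r") as f:
        return f.read()


def main():
    # Stand-in for rebuildsite(): create a temporary site directory,
    # write a sidebar into it, and read it back.
    with tempfile.TemporaryDirectory() as directory:
        with open(os.path.join(directory, "sidebar.html"), "w") as f:
            f.write("<ul><li>About</li></ul>")
        print(read_sidebar(directory))


# Runs only when the file is executed as a script, not when it is imported.
if __name__ == "__main__":
    main()
```

With the guard in place, another script can import the module and call read_sidebar without triggering a rebuild.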
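Posted on 2011-12-18 23:10:24
I like the other two answers so far, and I see you've already incorporated some of their suggestions.
I have two small additions:
# Open up the file doesn't really help. Comments like that only make the useful comments harder to find amid all the noise. Instead, try to keep comments that summarize a section of code and remove the rest. If you had to do something unusual, leave those comments in as well. Ask yourself, "If I didn't know what this code did, which lines would I need explained?"
https://codereview.stackexchange.com/questions/6923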