
Markdown static blog program

Code Review user
Asked 2011-12-16 16:39:17
3 answers · 765 views · 0 followers · 11 votes

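I was hoping some of the people on Stack Overflow could take a look at my Python static blog application. I have been using it for a few years. Recently I decided to clean it up and put it on GitHub. I am hoping some smarter Python programmers can offer advice and wisdom to help me improve, optimize, and simplify the code.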

The program is here: https://github.com/mshea/Pueblo

Some philosophy:

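  1. I don't want more features. I want it to be as simple as possible.
  2. I always prefer native modules. My ISP won't let me install new modules, so the Markdown module is the only one I use outside the defaults.
  3. I want to keep it in a single script unless splitting it up makes things simpler or easier.
  4. I am particularly interested in any potential security problems. Right now I don't see any.
  5. I am not keen on flexibility. I would rather do one thing well than many things badly. People who want a flexible blogging platform can use WordPress.
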
#!/usr/local/bin/python
#
# Pueblo: Python Markdown Static Blogger    
#
# 17 December 2011
#
# A single Python script to build a simple blog from a directory full of markdown files.
#
# This script requires the Markdown python implementation available at:
# http://pypi.python.org/pypi/Markdown/2.1.0
#
# This script requires markdown files using the following multimarkdown metadata as the first three lines
# of the processed .txt markdown files as follows:
#
# Title: the Title of your Document
# Author: Joe Blow
# Date: 15 December 2011
#
# The program will generate an index.html homepage file, an archive.html archive file, 
# and an index.xml RSS file.
#
# Header and footer data can be edited in the variables throughout the program. 
#
# This script expects the following additional files:
# style.css: The main site's stylesheet.
# iphone.css: The mobile version of the site's stylesheet.
# sidebar.html: A secondary set of data usually displayed as a sidebar.
#
# Instructions
# Install the Markdown python module.
# Configure this script by changing the configuration variables below.
# Put your static markdown .txt files in the configured directory
# Run the script either manually, with a regular cronjob, or as a CGI script.
# View the output at index.html

config = {
    "directory": ".", # No trailing slash.
    "site_url": "http://yoursite.net/", # Must have a trailing slash.
    "site_title": "Your Website",
    "site_description": "Your blog tagline.",
    "google_analytics_tag": "UA-111111-1",
    "author_name": "Your Name",
    "author_bio_link": "about.html",
    "amazon_tag": "mikesheanet-20",
    "twitter_tag": "twitterid",
    "author_email": "your@emailaddress.com",
    "header_image_url": "",
    "header_image_width": "",
    "header_image_height": "",
    "sidebar_on_article_pages": False,
    "minify_html": False,
}

nonentryfiles = []

# Main Program
import glob, re, rfc822, time, cgi, datetime, markdown
from time import gmtime, strftime, localtime, strptime
def rebuildsite ():
    textfiles = glob.glob(config["directory"]+"//*.txt")
    for nonfile in nonentryfiles:
        textfiles.remove(config["directory"]+"/"+nonfile)
    indexdata = []

    # Rip through the stack of .txt markdown files and build HTML pages from it.
    for eachfile in textfiles:
        eachfile = eachfile.replace(config["directory"]+"\\", "")
        content = open(eachfile).read()
        lines = re.split("\n", content)
        title = re.sub("(Title: )|(  )", "", lines[0])
        title = cgi.escape(title)
        urltitle = title.replace("&", "%26")
        author = lines[1].replace("Author: ","")
        date = re.sub("(  )|(\n)|(Date: )","",lines[2])
        numdate = strftime("%Y-%m-%d", strptime(date, "%d %B %Y"))
        content = markdown.markdown(re.sub("(Title:.*\n)|(Author:.*\n)|(Date:.*\n\n)|    ", "", content))
        summary = re.sub("<[^<]+?>","", content)
        summary = summary.replace("\n", " ")[0:200]
        htmlfilenamefull = htmlfilename = eachfile.replace(".txt", ".html")
        htmlfilename = htmlfilename.replace(config["directory"]+"/", "")
        postname = htmlfilename.replace(".html", "")
        # Build the HTML file, add a bit of footer text.
        htmlcontent = [buildhtmlheader("article", title, date)]
        htmlcontent.append(content)
        htmlcontent.append(buildhtmlfooter("article", urltitle))
        htmlfile = open(htmlfilenamefull, "w")
        htmlfile.write(minify("".join(htmlcontent)))
        htmlfile.close()
        if numdate <= datetime.datetime.now().strftime("%Y-%m-%d"):
            indexdata.append([[numdate],[title],[summary],[htmlfilename],[content]])

    # The following section builds index.html, archive.html and index.xml.  
    indexdata.sort()
    indexdata.reverse()
    indexbody=archivebody=rssbody=""
    count=0

    for indexrow in indexdata:
        dateobject = strptime(indexrow[0][0], "%Y-%m-%d")
        rssdate = strftime("%a, %d %b %Y 06:%M:%S +0000", dateobject)
        nicedate = strftime("%d %B %Y", dateobject)
        articleitem = '''
<h2><a href="%(article_link)s">%(article_title)s</a></h2>
<p>%(date)s - %(summary)s...</p>
'''     % {
        'article_link': indexrow[3][0],
        'article_title': indexrow[1][0],
        'date': nicedate,
        'summary': indexrow[2][0],
        }

        rssitem = '''
<item>
<title>%(title)s</title>
<link>%(link)s</link>
<guid>%(link)s</guid>
<pubDate>%(pubdate)s</pubDate>
<description>%(description)s</description>
<content:encoded>
<![CDATA[%(cdata)s]]>
</content:encoded>
</item>
'''     % {
        'title': indexrow[1][0],
        'link': config["site_url"]+indexrow[3][0],
        'pubdate': rssdate,
        'description': indexrow[2][0],
        'cdata': indexrow[4][0],
        }

        count = count + 1
        if count < 15:
            rssbody = rssbody + rssitem
        if count < 30:
            indexbody = indexbody+articleitem
        archivebody = archivebody + articleitem
    sidebardata = open(config["directory"]+"/sidebar.html").read()
    rssdatenow = rfc822.formatdate()

    indexdata = [buildhtmlheader("index", config["site_title"], "none")]
    indexdata.append(indexbody)
    indexdata.append("<h2><a href=\"archive.html\">View All %(article_count)s Articles</a></h2>\n</div>\n" 
        % { 'article_count': str(count) })
    indexdata.append(buildhtmlfooter("index", ""))
    indexfile = open(config["directory"]+"/index.html", "w").write(minify("".join(indexdata)))

    archivedata = [buildhtmlheader("archive", config["site_title"]+" Article Archive", "none")]
    archivedata.append(archivebody)
    archivedata.append("\n</div>\n")
    archivedata.append(buildhtmlfooter("archive", ""))
    archivefile = open (config["directory"]+"/archive.html", "w").write(minify("".join(archivedata)))

    rsscontent = '''<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/elements/1.1/\"
>

<channel>
<title>%(site_title)s</title>
<link>%(site_url)s</link>
<description>%(site_description)s</description>
<pubDate>%(rssdatenow)s</pubDate>
<language>en</language>
<atom:link href="%(site_url)sindex.xml" rel="self" type="application/rss+xml" />
%(rssbody)s
</channel>
</rss>
''' % {
    'site_url': config["site_url"],
    'site_title': config["site_title"],
    'site_description': config["site_description"],
    'rssdatenow': rssdatenow,
    'rssbody': rssbody,
    }

    rssfile = open(config["directory"]+"/index.xml", "w").write(minify(rsscontent))

# Subroutine to build out the page's HTML header
def buildhtmlheader(type, title, date):
    if config["header_image_url"] != "":
        headerimage = '''
<img class="headerimg" src="%(header_image_url)s" alt="%(site_title)s: %(site_description)s" height="%(header_image_height)s" width="%(header_image_width)s" />
'''     % {
        'header_image_url': config["header_image_url"],
        'site_title': config["site_title"],
        'site_description': config["site_description"],
        'header_image_height': config["header_image_height"],
        'header_image_width': config["header_image_width"],
        }

    htmlheader = ['''
<!DOCTYPE html>
<html>
<head>
<title>%(title)s</title>
<link rel="stylesheet" type="text/css" media="screen and (min-width: 481px)" href="style.css">
<link rel="stylesheet" type="text/css" media="only screen and (max-width: 480px)" href="iphone.css">
<link rel="alternate" type="application/rss+xml" title="%(title)s" href="index.xml">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="viewport" content="user-scalable=no, width=device-width" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', '%(google_analytics_tag)s']);
_gaq.push(['_trackPageview']);
(function() {  var ga = document.createElement('script');
 ga.type = 'text/javascript';
 ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0];
s.parentNode.insertBefore(ga, s);
})();
</script>
</head>
<body>
'''     % { 
        'title': title, 
        'google_analytics_tag': config["google_analytics_tag"], 
        } ]

    # Tons of conditional checks lay ahead. Does it use a header image 
    # and do you want the sidebar on article pages?
    if config["sidebar_on_article_pages"] != True and type == "article":
        htmlheader.append("\n<div class=\"article_container\">\n")
    else:
        htmlheader.append("\n<div class=\"container\">\n")
    if config["header_image_url"] != "" and type == "index":
        htmlheader.append(headerimage)
    elif config["header_image_url"] != "" and type != "index":
        htmlheader.append("<a href=\"/\">\n" + headerimage + "</a>\n")
    elif config["header_image_url"] == "" and type == "index":

        htmlheader.append('''
<div class="header">
<h1>%(site_title)s</h1>
<p>%(site_description)s</p>
</div>
'''     % {
        'site_title': config["site_title"],
        'site_description': config["site_description"],
        } )

    elif config["header_image_url"] == "" and type != "index":
        htmlheader.append('''
<p class="return_link">
<a href="index.html">%(site_title)s</a>
</p>
'''     % {
        'site_title': config["site_title"]
        } )
    if type == "index":
        htmlheader.append("\n<div class=\"article_list\">\n")
    elif type == "archive":
        htmlheader.append("\n<div class=\"article_list\">\n<h1>Article Archive</h1>\n")
    elif type == "article":
        htmlheader.append('''
<div class="article">
<h1>%(title)s</h1>
<p>by <a href="%(author_bio_link)s">%(author_name)s</a> on %(date)s</p>
'''     % {
        'author_bio_link': config["author_bio_link"],
        'title': title,
        'author_name': config["author_name"],
        'date': date,
        } )
    return "".join(htmlheader)

# Subroutine to remove all line breaks to make for some packed fast HTML
def minify(content):
    if config["minify_html"]:
        content = re.sub("\n","",content)
    return content

# Subroutine to build out the footer.
def buildhtmlfooter (type, urltitle):
    footer_parts = []
    sidebardata = open(config["directory"]+"/sidebar.html").read()
    if type == "index" or type == "archive" or config["sidebar_on_article_pages"]:
        footer_parts.append(sidebardata)
    if type == "article":
        footer_parts.append(
'''
<p>Send feedback to <a href="mailto:%(email)s">%(email)s</a> or <a href="http://twitter.com/share?via=%(twitter_tag)s&text=%(urltitle)s">share on twitter</a>.</p>
'''     % {
        'email': config['author_email'], 
        'twitter_tag': config['twitter_tag'], 
        'urltitle': urltitle,
        })
    footer_parts.append("\n</div>\n</body>\n</html>")
    return "".join(footer_parts)

# This program is designed to run as a CGI script so you can rebuild your site by hitting a URL.
print "Content-type: text/html\n\n"
rebuildsite()
print "<html><head><title>Site Rebuilt</title></head><body><h1>Site Rebuilt</h1></body></html>"

3 Answers

Code Review user

Answered 2011-12-17 02:09:25

Here are some suggestions:

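String comparisons should use == (and !=), not is (and is not). is may happen to work, but it means you are comparing identity (usually a memory address), whereas == compares values. For details, see https://stackoverflow.com/a/2988117/331473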
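To illustrate the difference, here is a minimal sketch written for Python 3. The variable names are made up for this example and are not taken from Pueblo. Lists are used for the identity check because two separately built lists are always distinct objects, whereas whether two equal strings share one object depends on the interpreter:

```python
# Two strings with the same value, one of them assembled at runtime.
expected = "article"
built = "".join(["art", "icle"])

# == compares values, which is what a check like `type == "article"` needs.
strings_equal = built == expected

# `is` compares identity. Two separately created lists hold equal values
# but are different objects, so `is` reports False.
first = ["index", "archive"]
second = ["index", "archive"]
values_equal = first == second
same_object = first is second

print(strings_equal, values_equal, same_object)  # prints: True True False
```

Reserve `is` for singletons such as `None`, where identity is exactly what you want to test.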

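Your boolean configs (for example minify_html) should be actual boolean values, True/False, rather than 1/0. Also, when you check them, you should drop the comparison. Example: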

if minify_html == 1:     # or minify_html == True 
   ...                   #    (if you've converted these to booleans)

can be written as:

if minify_html:
    ...

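Using module-level vars as configs is generally fine for a few settings, but once you have a whole catalog of them, they get hard to keep track of. While reading your code, I asked myself several times, "where did that var come from?"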

If you want to address this, you can put these in a dictionary so that they are "namespaced", so to speak. Example:

config = {
    "site_url": "...",
    "site_name": "..."
}

Then, in your code, the config bits are easier to spot:

if config['minify_html']:
    ....
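
Putting the two suggestions together, a minimal Python 3 sketch might look like the following. The dictionary holds only a hypothetical subset of the settings, and the `settings` parameter is an addition for this example so the function can be exercised with different flags:

```python
import re

# Hypothetical subset of the settings, gathered in one dictionary.
config = {
    "site_title": "Your Website",
    "minify_html": True,
}


def minify(content, settings=config):
    """Strip line breaks when the minify_html flag is set."""
    if settings["minify_html"]:  # truthiness check, no `== True` needed
        content = re.sub("\n", "", content)
    return content


print(minify("<p>\nhello\n</p>"))  # prints: <p>hello</p>
```

Every lookup now reads `config[...]` (or `settings[...]`), so it is obvious at the call site that the value is a setting rather than a local variable.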

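There is more cleanup that could be done, but these are the first few things that came to mind. One more thing: I don't have time to tackle it right now, but the long chain of if/elses in buildhtmlheader could probably be refactored to make things a little less redundant.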
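One possible direction for that refactoring, sketched in Python 3 under some assumptions: the markup is simplified from the original, and `PAGE_OPENERS`, `build_banner`, and `build_header_body` are hypothetical names, not part of Pueblo. The idea is to decide the banner with two independent questions (is there an image? is this the index page?) and to look up the per-page opening fragment in a dictionary instead of chaining elifs:

```python
# Hypothetical stand-in for the config dictionary used by the script.
config = {
    "site_title": "Your Website",
    "site_description": "Your blog tagline.",
    "header_image_url": "",
}

# One opening fragment per page type, looked up instead of chained elifs.
PAGE_OPENERS = {
    "index": '<div class="article_list">\n',
    "archive": '<div class="article_list">\n<h1>Article Archive</h1>\n',
    "article": '<div class="article">\n',
}


def build_banner(page_type, settings=config):
    """Return the banner markup: image or text, linked or not."""
    is_index = page_type == "index"
    if settings["header_image_url"]:
        image = '<img class="headerimg" src="%s" />\n' % settings["header_image_url"]
        return image if is_index else '<a href="/">\n' + image + "</a>\n"
    if is_index:
        return "<h1>%(site_title)s</h1>\n<p>%(site_description)s</p>\n" % settings
    return '<a href="index.html">%(site_title)s</a>\n' % settings


def build_header_body(page_type, settings=config):
    """Combine the banner with the opening fragment for this page type."""
    return build_banner(page_type, settings) + PAGE_OPENERS[page_type]
```

Each condition is now tested once, and adding a page type means adding one dictionary entry rather than another elif branch.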

8 votes

Code Review user

Answered 2011-12-18 09:34:29

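  1. I prefer uppercase names with underscores for constants.
  2. Imports should usually be on separate lines, e.g.:
     Yes:
         import os
         import sys
     No:
         import sys, os
  3. Use spaces instead of tabs.
  4. Avoid naming variables after built-in functions or objects. I mean file. If you need the built-in later, it will take time to work out why it no longer works.
  5. The with keyword closes the file automatically, even if an exception occurs, so it is the better way to open files:
         with open(filename, 'r') as f:
             content = f.read()
  6. If you ever need to import your module or part of it, your code will run rebuildsite every time. To prevent this, use:
         if __name__ == '__main__':
             print "Content-type: text/html\n\n"
             rebuildsite()
             print "Site Rebuilt"
  7. It is better to join paths with os.path.join:
         os.path.join(directory, 'sidebar.html')
  8. A more efficient way to concatenate strings in Python is to join a list, using triple quotes for multi-line text and % for formatting:
         parts = ['''%(title)s''' % {'title': title}]
         if not SIDEBAR_ON_ARTICLE_PAGES and type == "article":
             parts.append('')
         return ''.join(parts)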
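
Items 5 through 8 can be combined into one small sketch. This is written for Python 3, and `DIRECTORY`, `read_sidebar`, and `build_footer` are hypothetical names chosen for the example. The demo at the bottom writes a throwaway sidebar.html into a temporary directory so that it runs anywhere:

```python
import os
import tempfile

DIRECTORY = "."  # uppercase constant, as in item 1


def read_sidebar(directory=DIRECTORY):
    """Read sidebar.html using os.path.join and a with block."""
    path = os.path.join(directory, "sidebar.html")
    with open(path, "r") as sidebar_file:  # closed even if read() raises
        return sidebar_file.read()


def build_footer(sidebar, show_sidebar):
    """Collect the pieces in a list and join them once at the end."""
    parts = []
    if show_sidebar:
        parts.append(sidebar)
    parts.append("\n</div>\n</body>\n</html>")
    return "".join(parts)


if __name__ == "__main__":
    # Runs only when executed as a script, not when the module is imported.
    with tempfile.TemporaryDirectory() as demo_dir:
        with open(os.path.join(demo_dir, "sidebar.html"), "w") as demo_file:
            demo_file.write("<ul><li>About</li></ul>")
        print(build_footer(read_sidebar(demo_dir), show_sidebar=True))
```

Because of the `__name__` guard, another script can import `read_sidebar` or `build_footer` without triggering a site rebuild.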

Follow the other PEP 8 recommendations.

7 votes

Code Review user

Answered 2011-12-18 23:10:24

I like the other two answers so far, and I see you have already incorporated some of their suggestions.

I have two small additions:

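  1. This may be shorter, but it is harder to read:
         for nonfile in nonentryfiles: textfiles.remove(config["directory"]+"/"+nonfile) # except for the non-entry files
     I am guessing you come from a language that emphasizes brevity, but in Python you would rather write something that is easier to understand:
         for nonfile in nonentryfiles:
             textfiles.remove(config["directory"] + "/" + nonfile)
     The newline and indentation make it easier to read. (I also added spaces around the string concatenation, which makes it easier to read too.) Another place for whitespace is between those if/else blocks; it is hard to see where one ends and the next begins. Try:
         if test:
             ...
         else:
             ...

         if test:
             ...
         elif something_else:
             ...
     That makes each if statement easier to find.
  2. Commenting every line, like # Open up the file, does not really help. It only makes the useful comments harder to find among all the noise. Instead, try to keep comments that summarize a section of code and delete the rest. If you have to do something unusual, leave those comments in too. Ask yourself: "If I didn't know what this code did, which lines would I need explained?"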
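
As a small illustration of both points, here is a Python 3 sketch of the file-filtering step with spaced-out blocks and a single summarizing comment. The function name `drop_non_entries` is hypothetical, and unlike the original loop it returns a new list and silently ignores names that are not in the listing, instead of calling `textfiles.remove`:

```python
import os


def drop_non_entries(textfiles, directory, nonentryfiles):
    """Return the entry files, leaving out the configured non-entry files."""
    # Build full paths first so they compare equal to the directory listing.
    excluded = set()
    for nonfile in nonentryfiles:
        excluded.add(os.path.join(directory, nonfile))

    entries = []
    for path in textfiles:
        if path not in excluded:
            entries.append(path)

    return entries
```

The one comment explains the only non-obvious step (why the paths are joined before comparing); the loops themselves need no narration.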
5 votes
Original page content provided by Code Review. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://codereview.stackexchange.com/questions/6923
