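I am writing a script that checks out a git repository at a certain commit hash, does some work, and switches back to master. The purpose of the script is to collect students' homework solutions from Bitbucket. Note that all the repos are under the same Bitbucket account. There is one main Bitbucket account that is the admin of all these repos, and each student has write access to their own repo. Students must follow this directory structure in their repos: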
-assignments
|- assignment-1
|- assignment-2
.
.
.
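|- assignment-X
The directories inside contain the homework. Once the teacher announces a deadline, students must commit their code before it. The script reads the git log, finds the last commit made before the deadline, switches to that revision, and rsyncs the solution to a local directory.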
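So, this script will:
- Get the list of Bitbucket repo names from a file (students-info.json).
- Run rsync into solutions-directory/assignment-x-deadline/student-id.
I am looking for any hints, suggestions, general code improvements, bugs, anything.
Here is my code: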
#!/bin/python
"""
This script will take assignment solutions from each student repository. Based
on the timestamp given, it finds out the last commit made before timestamp
(i.e. deadline) and it checks out that revision, rsyncs the solution folder
of the required assignment with the solutions-repo and resets to HEAD.
The timestamp should be of the format 'Month Date H:M:S Year'
eg. Dec 19 22:31:01 2013
Input : List of students ids, assignment-id, timestamp
Example usage: To take out solutions of assignment 11 whose deadline was
Dec 19 22:31:01 2013, run the following
$python take_solutions.py -d 'Dec 19 22:31:01 2013' -a 'assignment-11'
To do:
- git_log_cmd with format string in get_commit_hash()
- dest_path is ugly
- dest_path should be global?
-
"""
import string
import os
import time
import datetime
import subprocess
import json
import argparse
import shlex
import logging
import datetime
from logging.handlers import TimedRotatingFileHandler
from dir_settings import *
from bb_settings import *
parser = argparse.ArgumentParser(description='This script will take assignment solutions from each student repository. Based on the timestamp given, it finds out the last commit made before timestamp (i.e. deadline) and it checks out that revision, rsyncs the solution folder of the required assignment with the solutions-repo and resets to HEAD.')
parser.add_argument('-d','--deadline', help='The timestamp should be of the \
format "Month Date H:M:S Year" e.g. "Dec 19 22:31:01 2013"',
required=True)
parser.add_argument('-a','--assignment_id', help='Please provide assignment \
id of the solutions you want to copy. e.g. assignment-7',
required=True)
NITRO_LOGGER = logging.getLogger('NITRO')
LOG_FILENAME = 'nitro.log'
SOLUTIONS_DIRECTORY = 'solutions-directory/'
STUDENTS_REPO_DIRECTORY = 'students-repo-directory/'
students_info = json.loads(open(STUDENTS_INFO, 'r').read())
args = vars(parser.parse_args())
assignment_id = args['assignment_id']
deadline = args['deadline']
DEST_PATH = SOLUTIONS_DIRECTORY + assignment_id + '-' + '-'.join(deadline.split()) + '/'
def get_commit_hash(repo_name, timestamp):
    git_log_cmd = shlex.split('git --git-dir=' + STUDENTS_REPO_DIRECTORY + repo_name + '/.git log --pretty=format:"%H %ad" --date=local')
    try:
        (output, error) = subprocess.Popen(git_log_cmd, stdout=subprocess.PIPE,
                                           stderr=LOG_FD).communicate()
        for git_log in string.split(output, os.linesep):
            deadline = datetime.datetime.strptime(timestamp, "%b %d %H:%M:%S %Y")
            # split the commit message by first white space, the returning list will
            # have hash as its first element and timestamp as second element
            commit_hash = git_log.split(' ', 1)[0]
            commit_timestamp = git_log.split(' ', 1)[1]
            if deadline > datetime.datetime.strptime(commit_timestamp, "%a %b %d %H:%M:%S %Y"):
                return commit_hash
    except Exception, e:
        NITRO_LOGGER.error("Couldn't get commit hash before deadline for repo %s: %s" % (repo_name, str(e)))
        #raise e
def sync_solutions(repo_name):
    def repo_exists(repo_name):
        return os.path.isdir(STUDENTS_REPO_DIRECTORY + repo_name)

    def clone_repo(repo_name):
        clone_cmd = shlex.split("git clone %s%s %s%s" % (BB_REPO_BASE_URL,
                                repo_name, STUDENTS_REPO_DIRECTORY, repo_name))
        subprocess.check_call(clone_cmd, stdout=LOG_FD, stderr=LOG_FD)

    def pull_repo(repo_name):
        pull_cmd = shlex.split("git --git-dir=%s/.git pull" % \
                               (STUDENTS_REPO_DIRECTORY + repo_name))
        subprocess.check_call(pull_cmd, stdout=LOG_FD, stderr=LOG_FD)

    def checkout_version(repo_name, commit_hash='-'):
        checkout_cmd = shlex.split("git --git-dir=%s/.git checkout %s" \
                                   % ((STUDENTS_REPO_DIRECTORY + repo_name), commit_hash))
        subprocess.check_call(checkout_cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.PIPE)

    def rsync(repo_name):
        src_path = STUDENTS_REPO_DIRECTORY + repo_name + '/assignments/' + assignment_id
        if not os.path.isdir(src_path):
            # either student messed up the dir structure or hasn't submitted his assignments
            return
        if not os.path.isdir(DEST_PATH + repo_name):
            os.makedirs(DEST_PATH + repo_name)
        rsync_cmd = shlex.split('rsync -rt %s %s' % (src_path, DEST_PATH + repo_name))
        subprocess.check_call(rsync_cmd, stdout=LOG_FD, stderr=LOG_FD)

    try:
        if repo_exists(repo_name):
            pull_repo(repo_name)
        else:
            clone_repo(repo_name)
    except Exception, e:
        NITRO_LOGGER.error('pull/clone repo failed for repo %s: %s', repo_name, str(e))
        return
    commit_hash = get_commit_hash(repo_name, deadline)
    if commit_hash:
        try:
            checkout_version(repo_name, commit_hash)
            rsync(repo_name)
            checkout_version(repo_name)
        except Exception, e:
            NITRO_LOGGER.error('git checkout failed for repo %s: %s' % (repo_name, str(e)))
    else:
        NITRO_LOGGER.debug('No assignment found before deadline for ' + repo_name)
        return
def setup_logging():
    NITRO_LOGGER.setLevel(logging.DEBUG) # make log level a setting
    # Add the log message handler to the logger
    myhandler = TimedRotatingFileHandler(LOG_FILENAME, when='midnight',
                                         backupCount=5)
    formatter = logging.Formatter(
        '%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        datefmt='%Y-%m-%d %I:%M:%S %p')
    myhandler.setFormatter(formatter)
    NITRO_LOGGER.addHandler(myhandler)

def init():
    if not os.path.isdir(STUDENTS_REPO_DIRECTORY):
        os.makedirs(STUDENTS_REPO_DIRECTORY)
    if not os.path.isdir(SOLUTIONS_DIRECTORY):
        os.makedirs(SOLUTIONS_DIRECTORY)
    if not os.path.isdir(DEST_PATH):
        os.makedirs(DEST_PATH)

def main():
    NITRO_LOGGER.debug('****Firing up NITRO***')
    init()
    for student_id, student_email in students_info.iteritems():
        NITRO_LOGGER.debug(student_id)
        sync_solutions(student_id)
    NITRO_LOGGER.debug('****Done with NITRO***')
if __name__ == '__main__':
    LOG_FD = open(LOG_FILENAME, 'a')
    setup_logging()
    main()

Posted on 2014-02-13 07:26:59
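The idea of checking commit timestamps is conceptually flawed. Git is a distributed version control system, with no central server or any other means of notarizing timestamps. A timestamp is determined entirely by the system clock of the machine on which the commit was created, and that clock can trivially be rolled back. The only reliable approach is therefore to clone/fetch all of the repositories at the deadline.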
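One way to act on this is to take the snapshot yourself when the deadline passes, so that the evidence is what the server held at that moment rather than what the commits claim. The following is only a minimal sketch: the helper names, the `snapshot_root` directory, and the idea of triggering it from a scheduler at the deadline are assumptions for illustration, not part of the original script.

```python
import os
import subprocess


def snapshot_command(base_url, repo_name, snapshot_root):
    """Build the git command that mirrors one student repo into snapshot_root."""
    dest = os.path.join(snapshot_root, repo_name + ".git")
    # --mirror copies every ref, so nothing here depends on commit timestamps.
    return ["git", "clone", "--mirror", base_url + repo_name, dest]


def snapshot_all(base_url, repo_names, snapshot_root):
    """Run this at the deadline; returns the names of repos that failed to clone."""
    failed = []
    for name in repo_names:
        if subprocess.call(snapshot_command(base_url, name, snapshot_root)) != 0:
            failed.append(name)
    return failed
```

Each snapshot is a bare mirror, so the assignment directory would still have to be extracted from it afterwards (for example by cloning from the mirror locally).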
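Then there is the question of which branch you want to inspect. Do you want to consider only the master branch? If so, you had better specify the master branch when running git log. Keep in mind that if you consider all commits created before the deadline, you may end up accepting a commit that the student has rolled back. In other words, if a student commits and then changes their mind (using git reset --hard HEAD^), you might mistake the discarded version for the submission, merely because it has a later timestamp. For that reason, I would prefer that you inspect only commits along an agreed-upon branch or tag, rather than everything that might exist in the repository.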
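In get_commit_hash(), you obtain commit_timestamp using the %ad pretty-print format. That is a misnomer, because %ad yields the author timestamp, not the commit timestamp. I believe you should be more interested in the commit timestamp. (Author times are not even necessarily monotonic as the chain of commits progresses, since commits can be rearranged using git rebase.)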
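To make the distinction concrete: in git's pretty formats, %ad and %at are author dates, while %cd and %ct are committer dates. The sketch below parses the output of `git log --pretty=format:'%H %ct' master` (hash plus committer time as a Unix timestamp) and picks the newest commit that is not later than the deadline. The function name and the sample hashes are made up for illustration.

```python
def last_commit_before(log_output, deadline_epoch):
    """Return the hash with the greatest committer time <= deadline_epoch, or None.

    log_output is text as produced by: git log --pretty=format:'%H %ct' <branch>
    """
    best_hash, best_time = None, None
    for line in log_output.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the output
        commit_hash, committer_time = line.split()
        committer_time = int(committer_time)
        if committer_time <= deadline_epoch and (
                best_time is None or committer_time > best_time):
            best_hash, best_time = commit_hash, committer_time
    return best_hash


# Hypothetical log output: three commits, newest first.
sample = "aaa 1387492261\nbbb 1387405861\nccc 1387319461\n"
print(last_commit_before(sample, 1387406000))
```

Comparing integer timestamps avoids parsing a locale-dependent date string for every commit, which is what the strptime() call inside the loop currently does.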
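Assuming you still want to go ahead with your original plan, you are working too hard. This gives you the hash of the latest commit on the master branch with a commit date in 2013:
git log -n 1 --until='2013-12-31 23:59:59' --pretty=%H master
Better still, read what gitrevisions(1) says about "the value of the ref at a point in time" and skip all of that parsing:
git checkout 'master@{2013-12-31 23:59:59}'
By the way, I strongly suggest that you drop your date format and use ISO 8601 instead.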
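Both suggestions could be combined in the script roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, `git -C` needs a reasonably recent git, and parsing `%b` assumes an English/C locale. It uses `git log --until` rather than the `master@{...}` form, since the latter looks up the local reflog rather than commit dates.

```python
import subprocess
from datetime import datetime


def to_iso8601(deadline):
    """Convert the script's current format, e.g. 'Dec 19 22:31:01 2013'."""
    parsed = datetime.strptime(deadline, "%b %d %H:%M:%S %Y")
    return parsed.isoformat(sep=" ")


def until_command(repo_dir, iso_deadline, branch="master"):
    """Build the git log invocation from the answer as an argument list."""
    return ["git", "-C", repo_dir, "log", "-n", "1",
            "--until=" + iso_deadline, "--pretty=%H", branch]


def commit_before(repo_dir, iso_deadline, branch="master"):
    """Return the hash of the last commit on branch before the deadline, or None."""
    out = subprocess.check_output(until_command(repo_dir, iso_deadline, branch))
    return out.decode().strip() or None
```

Passing an argument list to subprocess also removes the need for shlex.split() and the quoting concerns that come with building command strings.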

https://codereview.stackexchange.com/questions/41492