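Project
I want to create a syntax highlighter for Java using JavaScript, HTML and CSS. It uses regular expressions to find the parts that should be highlighted (at the moment: keywords, strings, comments, imports) and then wraps the found parts in HTML tags to highlight them.
Result
The website looks like this before any code is entered:

Example
I used the following Java snippet to test the code: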
import java.time.LocalDate;

public class Person {
    //Local variable for dateOfBirth
    private LocalDate dateOfBirth;

    public Person(int year, int month, int day) {
        //See API also: https://docs.oracle.com/javase/8/docs/api/java/time/LocalDate.html
        dateOfBirth = LocalDate.of(year, month, day);
        //Keywords (e.g. int) are not highlighted in comments and strings
        System.out.println("Hello (int)");
    }

    /*
     * Getter
     */
    public LocalDate getDateOfBirth() {
        return dateOfBirth;
    }
}
The result looks like this:

Background
This is my first HTML/CSS/JS project.
Code
var keywordsColor = "#0033cc";
var controlKeywordsColor = "#009933";
var typesKeywordsColor = "#3399ff";
var stringColor = "#ff3300";
var importColor = "#0033cc";
var commentColor = "gray";

var text;

var keywords = ["abstract", "assert", "class", "const", "extends", "false", "final",
    "implements", "import", "instanceof", "interface", "native", "new", "null", "package",
    "private", "protected", "public", "return", "static", "strictfp", "super", "synchronized",
    "System", "this", "throw", "throws", "transient", "true", "volatile"];
var controlKeywords = ["break", "case", "catch", "continue", "default", "do", "else",
    "finally", "for", "goto", "if", "switch", "try", "while"];
var typesKeywords = ["boolean", "byte", "char", "double", "enum", "float", "int",
    "long", "short", "String", "void"];
var otherKeywords = [];

function highlight() {
    text = document.getElementById("Input").value;
    highlightKeywords();
    highlightStrings();
    highlightImports();
    highlightSingleLineComments();
    highlightMultiLineComments();
    addStyles();
    document.getElementById("Output").value = text;
    document.getElementById("outputArea").innerHTML = text;
}

function highlightKeywords() {
    var i;
    for (i = 0; i < keywords.length; i++) {
        var x = new RegExp(keywords[i] + " ", "g");
        var y = "" + keywords[i] + " ";
        text = text.replace(x, y);
    }
    for (i = 0; i < controlKeywords.length; i++) {
        var x = new RegExp(controlKeywords[i] + " ", "g");
        var y = "" + controlKeywords[i] + " ";
        text = text.replace(x, y);
    }
    for (i = 0; i < typesKeywords.length; i++) {
        var x = new RegExp(typesKeywords[i] + " ", "g");
        var y = "" + typesKeywords[i]
            + " ";
        text = text.replace(x, y);
    }
}

function highlightStrings() {
    text = text.replace(/"(.*?)"/g,
        ""
        + "\"$1\"" + "");
}

function highlightImports() {
    text = text.replace(/import(.*?);/g,
        ""
        + "import$1;" + "");
}

function highlightSingleLineComments() {
    text = text.replace(/\/\/(.*)/g,
        ""
        + "//$1" + "");
}

function highlightMultiLineComments() {
    text = text.replace(/\/\*([\s\S]*?)\*\//g,
        ""
        + "/*$1*/" + "");
}

function addStyles() {
    text = "\n\n"
        + "#comment span {color:" + commentColor + "!important;}"
        + "#str span {color:" + stringColor + "!important;}" + text
        + "\n\n\n";
}

/* Navigation bar style */
.nav ul {
    background: ForestGreen; /* Sets the background-color */
    list-style: none; /* Removes bullet point */
    overflow: hidden; /* What happens when element is too big for formatting context*/
    padding: 0px; /* padding-area at all four sides of an element */
}

.nav li {
    float: left; /* Move element to the left and add new element on the right side*/
    border-right: 2px solid LightGray;/* Border lines on the right side of each element */
}

.nav a {
    color: black; /* Font color has to be set here, because otherwise it would be a blue hyperlink */
    display: inline-block; /* One box for all elements */
    font-size: large; /* Sets font size to a large size */
    text-decoration: none; /* Removes underline */
    padding: 4px;
}

.nav a:hover {
    background: AliceBlue; /* Changes background of element when user is hovering over it */
}

.nav a.active {
    background: DarkGreen; /* Changes background of current element */
}

/* Other */
#code {
    background: LightGray;
    font: monospace;
}

.column {
    float: left;
    width: 50%;
}

Home
HTML syntax-highlighting for Java
Input:
Highlight
Output:
document.getElementById("Input").style.whiteSpace = "nowrap";
document.getElementById("Output").style.whiteSpace = "nowrap";
Preview

Questions
How can this code be improved? Did I make any major mistakes with regard to HTML/CSS/JS best practices?
Any suggestions are appreciated.
A follow-up question can be found here.
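Posted on 2020-07-22 04:21:37
Interpreting source code in any language purely with regular expressions (that is, without actually parsing the code and understanding it at a syntactic level) is notoriously difficult. Your regular expressions do fall prey to some common problems of regex-as-parser: they will incorrectly highlight all of the following:
public class Person {
    private Account my_import_export;
    private Multibyte stupidClassName;
    System.out.println("Hi \"friend\".");
}
Making sure that your keywords cannot start in the middle of a word would help a lot and fixes the first two. The escaped-quote problem is trickier.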
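Both points can be sketched with plain regular expressions. The `\b` word-boundary assertion keeps a keyword from matching inside a longer identifier, and a string pattern that treats a backslash plus the following character as one unit copes with escaped quotes. The helper name `keywordRegex`, the constant `stringRegex` and the shortened keyword list are made up for illustration; this is a sketch under those assumptions, not a drop-in replacement for the original functions.

```javascript
// Hypothetical helper: builds one regex that matches any listed keyword as a whole word.
function keywordRegex(keywords) {
  // \b asserts a word boundary, so "import" no longer matches inside "my_import_export"
  return new RegExp("\\b(?:" + keywords.join("|") + ")\\b", "g");
}

// Matches a double-quoted Java string literal, including escaped characters such as \".
// [^"\\\n] is any ordinary character; \\. is a backslash followed by any character.
const stringRegex = /"(?:[^"\\\n]|\\.)*"/g;

const sample = [
  'private Account my_import_export;',
  'private Multibyte stupidClassName;',
  'System.out.println("Hi \\"friend\\".");'
].join("\n");

const foundKeywords = sample.match(keywordRegex(["import", "byte", "private"]));
const foundStrings = sample.match(stringRegex);

console.log(foundKeywords); // only the two standalone "private" keywords
console.log(foundStrings);  // one match: the whole literal, escaped quotes included
```

This still runs one pass per token type, so a keyword inside a string or comment would be matched as well.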
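Posted on 2020-07-21 17:32:47
For more complex examples, your current approach of highlighting one token type after another will fail. Imagine this:
String s = "public data, private secrets";
The words inside the string are not keywords.
To fix this, you need to change your code so that the input text is tokenized in a single pass, as in the following pseudocode: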
function tokenize(text) {
    const tokens = [];
    while (text !== '') {
        if (text starts with whitespace)
            tokens.push(['space', leading space]);
        else if (text starts with keyword)
            tokens.push(['keyword.flow', keyword]);
        else if (text starts with string)
            tokens.push(['string', string]);
        else
            error();
        text = text without the current token;
    }
    return tokens;
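}
With this structure, you can parse Java code correctly. Parsing more complex languages such as Python or Kotlin, or even Perl, requires a more sophisticated parser, but Java is a very simple language (at the syntactic level).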
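As a rough, runnable version of the pseudocode above, the sketch below uses sticky (`y` flag) regular expressions so that every pattern is tried only at the current position. The token type names, the shortened keyword list and the catch-all `other` token (used here instead of `error()` so that unknown characters do not abort the run) are assumptions made for illustration.

```javascript
// Token patterns, tried in order at the current position. The type names are illustrative.
const TOKEN_PATTERNS = [
  ["space",   /\s+/y],
  ["comment", /\/\/.*|\/\*[\s\S]*?\*\//y],
  ["string",  /"(?:[^"\\\n]|\\.)*"/y],
  ["word",    /[A-Za-z_$][A-Za-z0-9_$]*/y],
  ["other",   /[\s\S]/y]            // any single character not covered above
];

// Shortened keyword list, for the example only.
const KEYWORDS = new Set(["public", "private", "class", "return", "new", "int"]);

function tokenize(text) {
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    for (const [type, pattern] of TOKEN_PATTERNS) {
      pattern.lastIndex = pos;           // sticky regex: a match must start exactly at pos
      const match = pattern.exec(text);
      if (match === null) continue;
      const value = match[0];
      // Identifiers and keywords share one pattern; the set lookup tells them apart.
      const finalType = type === "word" && KEYWORDS.has(value) ? "keyword" : type;
      tokens.push([finalType, value]);
      pos += value.length;
      break;
    }
  }
  return tokens;
}

const tokens = tokenize('String s = "public data, private secrets";');
console.log(tokens);
```

Because the string literal is consumed as a single token, the words `public` and `private` inside it are never looked at as keywords.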
Once the text has been split into tokens, generating the highlighted HTML from them is straightforward.
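That last step could look like the following sketch. It assumes tokens in the `[type, value]` form used in the pseudocode; the class names (`tok-keyword` and so on) and the helper names `escapeHtml` and `renderTokens` are invented for the example. The token text is HTML-escaped first, because Java source routinely contains `<`, `>` and `&`.

```javascript
// Escapes the characters that have a special meaning in HTML.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Token types that get a <span>; everything else is emitted as plain text.
const HIGHLIGHTED = new Set(["keyword", "string", "comment"]);

function renderTokens(tokens) {
  return tokens.map(([type, value]) => {
    const safe = escapeHtml(value);
    return HIGHLIGHTED.has(type) ? `<span class="tok-${type}">${safe}</span>` : safe;
  }).join("");
}

// Hand-written token list standing in for real tokenizer output.
const exampleTokens = [
  ["keyword", "return"], ["space", " "], ["word", "a"], ["space", " "],
  ["other", "<"], ["space", " "], ["string", '"b"'], ["other", ";"]
];

const html = renderTokens(exampleTokens);
console.log(html);
```

With class names instead of inline styles, the colours can live in one stylesheet rule per token type.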
https://codereview.stackexchange.com/questions/245806