I'm interested in the splitting possibilities of Guava:
Splitter.on("|").split("foo|bar|baz");
// => "foo", "bar", "baz"
This works fine.
Now, what if I want to split on "|", but not between "[" and "]":
Splitter.on(something).split("foo|ba[r|ba]z");
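// => "foo", "ba[r|ba]z"
As far as I can tell, it is not possible to define this "something" in Guava.
I found this: Issue 799: Add google escape library to Guava. Is this related?
Answered on 2012-05-27 22:28:20
The correct way to deal with this is to write a parser. Nowadays that is really easy: just use a parser combinator library such as JParsec. You will end up with something like this: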
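// Sketch only: the combinator names are as given in this answer and may not
// match the exact method names of a particular JParsec release.
class ParserFactory {
  // A bracketed run such as "[r|ba]", inside which "|" is not a separator.
  Parser escapedSequence() {
    return Parsers.between(Scanners.string("["),
        Scanners.anyCharacterButNot("]"), Scanners.string("]"));
  }

  // One piece of a token: either a bracketed run or any character other than "|".
  Parser chunk() {
    return Parsers.or(escapedSequence(), Scanners.anyCharacterButNot("|"));
  }

  // One or more chunks per token, with tokens separated by "|".
  Parser wholeThing() {
    return Parsers.separatedBy(chunk().plus(), Scanners.string("|"));
  }
}
Answered on 2012-05-27 22:11:53
Here is code that works for the given use case (using the existing Splitter code as a reference):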
import static com.google.common.base.Preconditions.checkArgument;
import static com.google.common.base.Preconditions.checkNotNull;

import com.google.common.base.CharMatcher;
import com.google.common.collect.AbstractIterator;
import java.util.Iterator;

public class Splitter {
  private final CharMatcher trimmer;
  private final CharMatcher startTextQualifier;
  private final CharMatcher endTextQualifier;
  private final Strategy strategy;

  private Splitter(Strategy strategy, CharMatcher trimmer,
      CharMatcher startTextQualifier, CharMatcher endTextQualifier) {
    this.strategy = strategy;
    this.trimmer = trimmer;
    this.startTextQualifier = startTextQualifier;
    this.endTextQualifier = endTextQualifier;
  }

  private Splitter(Strategy strategy) {
    // CharMatcher.none() is the Guava 19+ spelling of the older CharMatcher.NONE constant.
    this(strategy, CharMatcher.none(), CharMatcher.none(), CharMatcher.none());
  }

  public Splitter trimResults(CharMatcher trimmer) {
    checkNotNull(trimmer);
    return new Splitter(strategy, trimmer, startTextQualifier, endTextQualifier);
  }

  // Separators found between a start and an end text qualifier are not split on.
  public Splitter ignoreIn(CharMatcher startTextQualifier, CharMatcher endTextQualifier) {
    checkNotNull(startTextQualifier);
    checkNotNull(endTextQualifier);
    return new Splitter(strategy, trimmer, startTextQualifier, endTextQualifier);
  }

  public Splitter ignoreIn(char startTextQualifier, char endTextQualifier) {
    return ignoreIn(CharMatcher.is(startTextQualifier), CharMatcher.is(endTextQualifier));
  }

  public Splitter trimResults() {
    // CharMatcher.whitespace() replaces the older CharMatcher.WHITESPACE constant.
    return trimResults(CharMatcher.whitespace());
  }

  public static Splitter on(final CharMatcher separatorMatcher) {
    checkNotNull(separatorMatcher);
    return new Splitter(new Strategy() {
      @Override public SplittingIterator iterator(Splitter splitter, final CharSequence toSplit) {
        return new SplittingIterator(splitter, toSplit) {
          @Override int separatorStart(int start) {
            boolean wrapped = false;
            for (int i = start; i < toSplit.length(); i++) {
              /*
               * Suppose the start text qualifier is '[' and the end text qualifier is ']'.
               * The code below is a simple on/off toggle: it does not handle multiple
               * start-end combinations, i.e. it does not check that each start is
               * properly closed. For a configuration like
               *   Splitter.on("|").ignoreIn('[', ']').split("abc|[abc|[def]ghi]|jkl")
               * the result is: abc, [abc|[def]ghi], jkl
               */
              if (!wrapped && startTextQualifier.matches(toSplit.charAt(i))) {
                wrapped = true;
              } else if (wrapped && endTextQualifier.matches(toSplit.charAt(i))) {
                wrapped = false;
              }
              if (!wrapped && separatorMatcher.matches(toSplit.charAt(i))) {
                return i;
              }
            }
            return -1;
          }

          @Override int separatorEnd(int separatorPosition) {
            return separatorPosition + 1;
          }
        };
      }
    });
  }

  public static Splitter on(final String separator) {
    checkArgument(!separator.isEmpty(), "The separator may not be the empty string.");
    checkArgument(separator.length() <= 2, "The separator's max length is 2, passed - %s.", separator);
    if (separator.length() == 1) {
      return on(separator.charAt(0));
    }
    return new Splitter(new Strategy() {
      @Override public SplittingIterator iterator(Splitter splitter, final CharSequence toSplit) {
        return new SplittingIterator(splitter, toSplit) {
          @Override public int separatorStart(int start) {
            int delimiterLength = separator.length();
            boolean wrapped = false;
            positions:
            for (int p = start, last = toSplit.length() - delimiterLength; p <= last; p++) {
              // Track the qualifier state at position p (not at the delimiter index),
              // using the same on/off toggle as the CharMatcher-based variant.
              char c = toSplit.charAt(p);
              if (!wrapped && startTextQualifier.matches(c)) {
                wrapped = true;
              } else if (wrapped && endTextQualifier.matches(c)) {
                wrapped = false;
              }
              if (wrapped) {
                continue; // inside qualifiers: never a separator position
              }
              for (int i = 0; i < delimiterLength; i++) {
                if (toSplit.charAt(i + p) != separator.charAt(i)) {
                  continue positions;
                }
              }
              return p;
            }
            return -1;
          }

          @Override public int separatorEnd(int separatorPosition) {
            return separatorPosition + separator.length();
          }
        };
      }
    });
  }

  public static Splitter on(char separator) {
    return on(CharMatcher.is(separator));
  }

  public Iterable<String> split(final CharSequence sequence) {
    checkNotNull(sequence);
    return new Iterable<String>() {
      @Override public Iterator<String> iterator() {
        return spliterator(sequence);
      }
    };
  }

  private Iterator<String> spliterator(CharSequence sequence) {
    return strategy.iterator(this, sequence);
  }

  private interface Strategy {
    Iterator<String> iterator(Splitter splitter, CharSequence toSplit);
  }

  private abstract static class SplittingIterator extends AbstractIterator<String> {
    final CharSequence toSplit;
    final CharMatcher trimmer;
    final CharMatcher startTextQualifier;
    final CharMatcher endTextQualifier;

    /**
     * Returns the first index in {@code toSplit} at or after {@code start}
     * that contains the separator.
     */
    abstract int separatorStart(int start);

    /**
     * Returns the first index in {@code toSplit} after {@code
     * separatorPosition} that does not contain a separator. This method is only
     * invoked after a call to {@code separatorStart}.
     */
    abstract int separatorEnd(int separatorPosition);

    int offset = 0;

    protected SplittingIterator(Splitter splitter, CharSequence toSplit) {
      this.trimmer = splitter.trimmer;
      this.startTextQualifier = splitter.startTextQualifier;
      this.endTextQualifier = splitter.endTextQualifier;
      this.toSplit = toSplit;
    }

    @Override
    protected String computeNext() {
      // Loop (rather than a single 'if') so that an empty token is skipped
      // instead of ending the iteration early.
      while (offset != -1) {
        int start = offset;
        int separatorPosition = separatorStart(offset);
        int end = calculateEnd(separatorPosition);
        start = trimStartIfRequired(start, end);
        end = trimEndIfRequired(start, end);
        if (start != end) {
          return toSplit.subSequence(start, end).toString();
        }
      }
      return endOfData();
    }

    // Computes the end of the current token and advances 'offset' past the separator.
    private int calculateEnd(int separatorPosition) {
      int end;
      if (separatorPosition == -1) {
        end = toSplit.length();
        offset = -1;
      } else {
        end = separatorPosition;
        offset = separatorEnd(separatorPosition);
      }
      return end;
    }

    private int trimEndIfRequired(int start, int end) {
      while (end > start && trimmer.matches(toSplit.charAt(end - 1))) {
        end--;
      }
      return end;
    }

    private int trimStartIfRequired(int start, int end) {
      while (start < end && trimmer.matches(toSplit.charAt(start))) {
        start++;
      }
      return start;
    }
  }
}
A small test:
public static void main(String[] args) {
  Splitter splitter = Splitter.on("|").ignoreIn('[', ']');
  System.out.println(Joiner.on(',').join(splitter.split("foo|ba[r|ba]z")));
  // yields -> foo,ba[r|ba]z
}
Please note: this code has not been thoroughly tested and does not cover every case, so feel free to modify it to suit your needs.
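The nested-qualifier case mentioned in the code comment above can be covered by counting depth instead of toggling a boolean. The following is only a minimal JDK-only sketch of that idea, not part of the Splitter class above; the class name `NestedAwareSplit` and its `split` method are hypothetical names chosen for illustration, and it assumes single-character separator and qualifiers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class NestedAwareSplit {

  /** Splits on {@code separator}, ignoring separators inside (possibly nested) open/close pairs. */
  static List<String> split(String input, char separator, char open, char close) {
    List<String> parts = new ArrayList<>();
    int depth = 0;       // how many open qualifiers are currently unclosed
    int tokenStart = 0;  // index where the current token begins
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      if (c == open) {
        depth++;
      } else if (c == close && depth > 0) {
        depth--;
      } else if (c == separator && depth == 0) {
        parts.add(input.substring(tokenStart, i));
        tokenStart = i + 1;
      }
    }
    parts.add(input.substring(tokenStart)); // last token (may be empty)
    return parts;
  }

  public static void main(String[] args) {
    System.out.println(split("foo|ba[r|ba]z", '|', '[', ']'));
    // prints [foo, ba[r|ba]z]
    System.out.println(split("abc|[abc|[def]|ghi]|jkl", '|', '[', ']'));
    // prints [abc, [abc|[def]|ghi], jkl]
  }
}
```

Unlike the Splitter above, this sketch keeps empty tokens and does no trimming. If an opening qualifier is never closed, everything after it ends up in the final token.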
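Answered on 2012-05-26 04:38:25
Guava's Splitter is very powerful: it can handle regex separators, it can split into maps, and more. But what you are trying to achieve is really beyond the scope of any generic parser.
You want a splitter with an on/off switch. I believe the only way to do that is manually, something like this: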
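As a side note on regex separators: for flat, balanced brackets, a negative lookahead can approximate the on/off behaviour. The sketch below uses only the JDK's `String.split`; the class name `RegexSplit` and the constant name are hypothetical, and the pattern is an assumption that holds only when brackets are balanced and never nested.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class RegexSplit {

  // A '|' that is NOT followed by "any run of non-'[' characters and then ']'",
  // i.e. a '|' that does not sit inside an open bracket pair.
  static final String PIPE_OUTSIDE_BRACKETS = "\\|(?![^\\[]*\\])";

  static List<String> split(String input) {
    // Note: String.split drops trailing empty strings.
    return Arrays.asList(input.split(PIPE_OUTSIDE_BRACKETS));
  }

  public static void main(String[] args) {
    System.out.println(split("foo|ba[r|ba]z"));
    // prints [foo, ba[r|ba]z]
  }
}
```

The same pattern string could presumably be handed to a regex-based splitter as well, but it breaks down as soon as brackets nest or are left unclosed, which is where a hand-written loop or a parser becomes the safer choice.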
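List<String> ls = new ArrayList<String>();
int b = 0;  // start of the current token
int j = 0;  // position of the bracket currently being examined
String str = "foo|ba[r|ba]z";
int e = str.indexOf('|');  // candidate separator position
do {
  if (b > j) {
    j = str.indexOf('[', j);
    while (j > 0 && e >= j) {
      j = str.indexOf(']', j);
      if (j < 0) {
        // Unclosed bracket: the rest of the string is one token.
        ls.add(str.substring(b));
        return;
      }
      j = str.indexOf('[', j);
    }
  }
  ls.add(str.substring(b, e));
  System.out.println(str.substring(b, e));
  b = ++e;
  e = str.indexOf('|', e);
} while (e >= 0);
(Disclaimer: this code only gives the idea; it does not work.)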
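For reference, the same on/off idea can be written as a single pass over the characters. This is a minimal sketch rather than a fix of the snippet above: the class name `ManualSplit` and the method `splitOutsideBrackets` are hypothetical, and the delimiters '|', '[' and ']' are hard-coded as in the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ManualSplit {

  /** Splits on '|' except where the '|' lies between '[' and ']'. */
  static List<String> splitOutsideBrackets(String str) {
    List<String> ls = new ArrayList<>();
    boolean inBrackets = false; // the on/off switch
    int b = 0;                  // start of the current token
    for (int e = 0; e < str.length(); e++) {
      char c = str.charAt(e);
      if (c == '[') {
        inBrackets = true;
      } else if (c == ']') {
        inBrackets = false;
      } else if (c == '|' && !inBrackets) {
        ls.add(str.substring(b, e));
        b = e + 1;
      }
    }
    ls.add(str.substring(b)); // whatever follows the last separator
    return ls;
  }

  public static void main(String[] args) {
    System.out.println(splitOutsideBrackets("foo|ba[r|ba]z"));
    // prints [foo, ba[r|ba]z]
  }
}
```

Because the switch is a plain boolean, nested brackets are not tracked: the first ']' turns splitting back on.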
https://stackoverflow.com/questions/10755365