Scanning data with an unspecified delimiter and adding it to an array

Stack Overflow user
Asked on 2016-07-10 04:12:39
Answers: 2, Views: 60, Followers: 0, Votes: 1

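I have an assignment that asks me to process the data below from a txt file. There is no specified delimiter that would make it easy to sort the data into an array list. I can use the Scanner class to read the text file and sort it into an array like so: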

for (int rows = 0; rows < array.length; rows++) {
    array[rows][0] = fileIn.next();
    array[rows][1] = fileIn.next();
    // ...
}

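and so on. The names are harder, however, because they contain a varying number of spaces and there may be a varying number of names. I would like a full name such as "Allison, Mrs. Hudson J C (Bessie Waldo Daniels)" to be its own element. I am not sure where to begin, but I think one solution would be to have the program check for "male" or "female", so that it knows where to start a new element. Any help would be appreciated.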

1   1   Allen, Miss. Elisabeth Walton   female  29  211.3375
1   1   Allison, Master. Hudson Trevor  male    0.9167  151.5500
1   0   Allison, Miss. Helen Loraine    female  2   151.5500
1   0   Allison, Mr. Hudson Joshua Creighton    male    30  151.5500
1   0   Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female  25  151.5500
1   1   Anderson, Mr. Harry male    48  26.5500
1   1   Andrews, Miss. Kornelia Theodosia   female  63  77.9583
1   0   Andrews, Mr. Thomas Jr  male    39  0.0000
1   1   Appleton, Mrs. Edward Dale (Charlotte Lamson)   female  53  51.4792
1   0   Artagaveytia, Mr. Ramon male    71  49.5042
1   0   Astor, Col. John Jacob  male    47  227.5250
1   1   Astor, Mrs. John Jacob (Madeleine Talmadge Force)   female  18  227.5250
1   1   Aubart, Mme. Leontine Pauline   female  24  69.3000
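
The idea described in the question (keep reading tokens until "male" or "female" appears) could be sketched roughly as follows. This is only an illustration, not code from the original post: the class and method names are hypothetical, it assumes every record is complete, and it assumes no name contains the standalone word "male" or "female". Runs of spaces inside a name are collapsed to a single space.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class TokenParser {

    // Reads one record: two numbers, a name of any length, the sex column,
    // then the two trailing numbers. Returns the six fields as strings.
    static String[] readRecord(Scanner in) {
        String[] row = new String[6];
        row[0] = in.next();
        row[1] = in.next();

        // Collect name tokens until the sex column is reached
        StringBuilder name = new StringBuilder();
        String token = in.next();
        while (!token.equals("male") && !token.equals("female")) {
            if (name.length() > 0) {
                name.append(' ');
            }
            name.append(token);
            token = in.next();
        }
        row[2] = name.toString();
        row[3] = token;      // "male" or "female"
        row[4] = in.next();  // first number after the sex column
        row[5] = in.next();  // second number after the sex column
        return row;
    }

    // Reads records until the scanner runs out of tokens
    static List<String[]> readAll(Scanner in) {
        List<String[]> rows = new ArrayList<>();
        while (in.hasNext()) {
            rows.add(readRecord(in));
        }
        return rows;
    }

    public static void main(String[] args) {
        String sample =
            "1   1   Allen, Miss. Elisabeth Walton   female  29  211.3375\n" +
            "1   0   Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female  25  151.5500\n";
        for (String[] row : readAll(new Scanner(sample))) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

For a real file, the same methods would presumably be called with a Scanner built from the file (for example `new Scanner(new File(...))`) instead of from a string.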

2 Answers

Stack Overflow user

Accepted answer

Posted on 2016-07-10 04:39:31

This is a good job for a regular expression. Here is a pattern that fits the sample data:

([\d]) +([\d]) +(.+\S) +(female|male) +([\d.]+)  +([\d.]+)
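
Note that this pattern uses literal spaces and expects at least two of them before the last column. If the real file separates columns with tabs or with a different number of spaces, a variant built on `\s+` may be more forgiving. The following is a minimal sketch under that assumption; the class and method names are hypothetical, and it uses a lazy `.+?` for the name plus `matches()` so that the whole line has to fit the layout.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlexibleWhitespace {

    // Same six groups as the pattern above, but \s+ accepts tabs as well as
    // any number of spaces between columns
    static final Pattern LINE = Pattern.compile(
        "\\s*(\\d+)\\s+(\\d+)\\s+(.+?)\\s+(female|male)\\s+([\\d.]+)\\s+([\\d.]+)\\s*");

    // Returns the six fields, or null when the line does not fit the layout
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[6];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1);
        }
        return fields;
    }

    public static void main(String[] args) {
        String spaces = "1   1   Anderson, Mr. Harry male    48  26.5500";
        String tabs = "1\t1\tAnderson, Mr. Harry\tmale\t48\t26.5500";
        // Both lines yield the same six fields
        System.out.println(String.join(" | ", parse(spaces)));
        System.out.println(String.join(" | ", parse(tabs)));
    }
}
```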

Here is a complete example (originally hosted on repl.it):

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Main {
    public static void main( String args[] ){
        String text = 
            "1   1   Allen, Miss. Elisabeth Walton   female  29  211.3375\n"+
            "1   1   Allison, Master. Hudson Trevor  male    0.9167  151.5500\n"+
            "1   0   Allison, Miss. Helen Loraine    female  2   151.5500\n"+
            "1   0   Allison, Mr. Hudson Joshua Creighton    male    30  151.5500\n"+
            "1   0   Allison, Mrs. Hudson J C (Bessie Waldo Daniels) female  25  151.5500\n"+
            "1   1   Anderson, Mr. Harry male    48  26.5500\n"+
            "1   1   Andrews, Miss. Kornelia Theodosia   female  63  77.9583\n"+
            "1   0   Andrews, Mr. Thomas Jr  male    39  0.0000\n"+
            "1   1   Appleton, Mrs. Edward Dale (Charlotte Lamson)   female  53  51.4792\n"+
            "1   0   Artagaveytia, Mr. Ramon male    71  49.5042\n"+
            "1   0   Astor, Col. John Jacob  male    47  227.5250\n"+
            "1   1   Astor, Mrs. John Jacob (Madeleine Talmadge Force)   female  18  227.5250\n"+
            "1   1   Aubart, Mme. Leontine Pauline   female  24  69.3000\n";

        String lines[] = text.split("\\r?\\n");

        String pattern = "([\\d]) +([\\d]) +(.+\\S) +(female|male) +([\\d.]+)  +([\\d.]+)";
        Pattern r = Pattern.compile(pattern);

        for (String l : lines) {
            Matcher m = r.matcher(l);
            if (m.find( )) {
                System.out.println(" ------------------- New Text Line -------------------");
                System.out.println("Group 1: " + m.group(1) );
                System.out.println("Group 2: " + m.group(2) );
                System.out.println("Group 3: " + m.group(3) );
                System.out.println("Group 4: " + m.group(4) );
                System.out.println("Group 5: " + m.group(5) );
                System.out.println("Group 6: " + m.group(6) );
            } else {
                System.out.println("Line did not match");
            }   
        }
    }
}

This produces output like the following:

 ------------------- New Text Line -------------------
Group 1: 1
Group 2: 1
Group 3: Allen, Miss. Elisabeth Walton
Group 4: female
Group 5: 29
Group 6: 211.3375
 ------------------- New Text Line -------------------
Group 1: 1
Group 2: 1
Group 3: Allison, Master. Hudson Trevor
Group 4: male
Group 5: 0.9167
Group 6: 151.5500
 ------------------- New Text Line -------------------
Group 1: 1
Group 2: 0
Group 3: Allison, Miss. Helen Loraine
Group 4: female
Group 5: 2
Group 6: 151.5500
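
Since the question is about getting the data into an array, the matched groups could also be converted to typed values and collected in a list instead of being printed. The sketch below reuses the pattern from this answer. The `Passenger` class and its field names (`pclass`, `survived`, `age`, `fare`) are guesses made for illustration, because the question does not name the columns. Lines that do not match are skipped silently here, and a malformed number such as `1.2.3` would still throw a `NumberFormatException`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PassengerLoader {

    // Hypothetical record type; the field names are assumptions
    static class Passenger {
        final int pclass;
        final int survived;
        final String name;
        final String sex;
        final double age;
        final double fare;

        Passenger(int pclass, int survived, String name, String sex,
                  double age, double fare) {
            this.pclass = pclass;
            this.survived = survived;
            this.name = name;
            this.sex = sex;
            this.age = age;
            this.fare = fare;
        }
    }

    // Pattern taken unchanged from the answer above
    static final Pattern LINE = Pattern.compile(
        "([\\d]) +([\\d]) +(.+\\S) +(female|male) +([\\d.]+)  +([\\d.]+)");

    // Parses every matching line of the text into a Passenger
    static List<Passenger> load(String text) {
        List<Passenger> result = new ArrayList<>();
        for (String line : text.split("\\r?\\n")) {
            Matcher m = LINE.matcher(line);
            if (m.find()) {
                result.add(new Passenger(
                    Integer.parseInt(m.group(1)),
                    Integer.parseInt(m.group(2)),
                    m.group(3),
                    m.group(4),
                    Double.parseDouble(m.group(5)),
                    Double.parseDouble(m.group(6))));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text =
            "1   1   Allison, Master. Hudson Trevor  male    0.9167  151.5500\n" +
            "1   0   Andrews, Mr. Thomas Jr  male    39  0.0000\n";
        for (Passenger p : load(text)) {
            System.out.println(p.name + " / " + p.sex + " / " + p.age + " / " + p.fare);
        }
    }
}
```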
Votes: 2

Stack Overflow user

Posted on 2016-07-10 05:07:31

I agree with your own suggestion. You can use a regular expression to help parse everything between the first two numbers and "male"/"female".

Your code could look something like this:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class test {

    private static String[] parseLine(String line) {
        String[] output = new String[6];

        Pattern nonWhitespace = Pattern.compile("\\S+");
        Pattern sex = Pattern.compile("\\s*(male|female)");

        Matcher m = sex.matcher(line);

        if (!m.find()) {
            // Couldn't find "male" or "female"
            throw new IllegalArgumentException("No sex column in line: " + line);
        }

        // Everything before the sex column: the two numbers and the name
        String firstHalf = line.substring(0, m.start());
        // The sex column and everything after it
        String lastHalf = line.substring(m.start());

        Matcher firstHalfTokenizer = nonWhitespace.matcher(firstHalf);

        if (!firstHalfTokenizer.find()) {
            // Couldn't find any non-whitespace characters
            throw new IllegalArgumentException("Missing first column in line: " + line);
        }
        output[0] = firstHalfTokenizer.group();

        if (!firstHalfTokenizer.find()) {
            // Couldn't find a second non-whitespace token
            throw new IllegalArgumentException("Missing second column in line: " + line);
        }
        output[1] = firstHalfTokenizer.group();

        // Whatever remains of the first half is the full name
        output[2] = firstHalf.substring(firstHalfTokenizer.end()).trim();

        Matcher lastHalfTokenizer = nonWhitespace.matcher(lastHalf);

        int index = 3;

        // Stop at the end of the array so that extra columns cannot overflow it
        while (index < output.length && lastHalfTokenizer.find()) {
            output[index] = lastHalfTokenizer.group();
            index++;
        }

        return output;
    }

    public static void main(String[] args) throws IOException {
        List<String[]> array = new ArrayList<String[]>();

        // Assumption: the path to the data file is passed as the first argument
        for (String line : Files.readAllLines(Paths.get(args[0]))) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            array.add(parseLine(line));
        }

        // Do whatever you want with it
    }
}
Votes: 0
Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/38288503