
Java program to ping thousands of URLs

Stack Overflow user
Asked 2019-10-08 16:59:00
3 answers, 363 views, 0 followers, 0 votes

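I want to write a program that pings at least 10,000 URLs. I wrote a small program and found that it is not as fast as I expected.

Pinging 100 URLs takes 3-4 minutes.

Does anyone have a better suggestion?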

private static Map<String, String> findUnreachableUrls(Set<String> urls) {
        Map<String, String> badUrls = new TreeMap<>();
        for (String url : urls) {
            HttpURLConnection connection;
            try {
                connection = (HttpURLConnection) new URL(url).openConnection();
                connection.setRequestMethod("HEAD");
                connection.connect();
                int responseCode = connection.getResponseCode();
                if (responseCode != 200 && responseCode != 302) {
                    badUrls.put(url, Integer.toString(responseCode));
                }
            } catch (IOException e) {
                badUrls.put(url, e.getMessage());
            }

        }
        return badUrls;
    }

3 Answers

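Stack Overflow user

Accepted answer

Posted 2019-10-08 20:56:53

As I wrote, I would do it like this (untested). If there are only a few hosts and a large number of URLs, the per-host groups of URLs should probably be split further.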

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TestURLs implements Function<String, Optional<TestURLs.Tuple>> {

    public static final int TIMEOUT = 3000;

    // static nested class: it needs no reference to an enclosing TestURLs instance
    public static class Tuple {
        final String url;
        final String error;

        public Tuple(String url, String error) {
            this.url = url;
            this.error = error;
        }
    }

    public static enum HostNamePortExtractor implements Function<String, String> {

        INSTANCE;

        @Override
        public String apply(String url) {
            try {
                URL u = new URL(url);
                // getPort() returns -1 when the URL has no explicit port
                return u.getHost() + ":" + u.getPort();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }

    @Override
    public Optional<Tuple> apply(String url) {
        HttpURLConnection connection;
        try {
            connection = (HttpURLConnection) new URL(url).openConnection();
            connection.setRequestMethod("HEAD");
            connection.setReadTimeout(TIMEOUT);
            connection.setConnectTimeout(TIMEOUT);
            connection.connect();
            int responseCode = connection.getResponseCode();
            // are you sure? I think you would have liked to write here "and" not or
            //if (responseCode != 200 || responseCode != 302) {
            if (responseCode != 200 && responseCode != 302) {
                return Optional.of(new Tuple(url, Integer.toString(responseCode)));
            }
        } catch (IOException e) {
            return Optional.of(new Tuple(url, e.getMessage()));
        }
        return Optional.empty();
    }

    public Map<String, String> process() {
        List<String> URLs = new ArrayList<>(); // add urls here
        // group by hostname+port
        Map<String, List<String>> groupedUrls = URLs.stream().collect(Collectors.groupingBy(HostNamePortExtractor.INSTANCE));
        Stream<Tuple> errors = groupedUrls.keySet().parallelStream()
                // I am not fully sure, but hoping that the stream() will go to the same thread
                .flatMap(host -> groupedUrls.get(host).stream())
                // go to the server
                .map(this::apply)
                // if there was no error, filter out the optional.empties
                .filter(o -> o.isPresent())
                // get the Tuple with url and the error
                .map(o -> o.get());
        // make a map
        return errors.collect(Collectors.toMap(t -> t.url, t -> t.error));
    }

    public static void main(String[] args) {
        TestURLs testUrls = new TestURLs();
        testUrls.process().entrySet().forEach(e -> {
            System.out.println(e.getKey() + " error: " + e.getValue());
        });
    }

}
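
The answer mentions that per-host groups may need to be split further when a few hosts carry most of the URLs. One possible way to do that is to cut each host's list into fixed-size chunks so that several workers can share a busy host. The sketch below is only an illustration: the class name `UrlChunker`, the method `split`, and the chunk size are hypothetical and not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UrlChunker {

    // Hypothetical helper: splits one host's URL list into consecutive
    // chunks holding at most chunkSize elements each.
    static List<List<String>> split(List<String> urls, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < urls.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, urls.size());
            // copy the view so each chunk is independent of the source list
            chunks.add(new ArrayList<>(urls.subList(i, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("u1", "u2", "u3", "u4", "u5");
        System.out.println(split(urls, 2)); // prints [[u1, u2], [u3, u4], [u5]]
    }
}
```

Under this assumption, `process()` could stream over the chunks of every group instead of over the host keys, so that each chunk is handled sequentially while different chunks run in parallel. Whether that helps depends on how many concurrent requests each host tolerates.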

Votes: 1

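Stack Overflow user

Posted 2019-10-08 17:04:12

You should use parallel threads, for example 5 threads that each handle 20 URLs, and aggregate the results at the end. That will produce results faster. The simplest solution is to process the URLs in parallel with Java 8 Streams. Here is a sample program that does this: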

import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListMap;

public class Main {

    public static void main(String[] args) {
        Set<String> urlSet = new HashSet<>();
        //Populate your set with the Strings
        findUnreachableUrls(urlSet);
    }

    private static Map<String, String> findUnreachableUrls(Set<String> urls) {
        // TreeMap is not thread-safe, so use a concurrent sorted map for parallel writes
        Map<String, String> badUrls = new ConcurrentSkipListMap<>();
        urls.parallelStream().forEach(
                url -> {
                    String result = checkUrl(url);
                    // checkUrl returns "" for reachable URLs; record only the failures
                    if (!result.isEmpty()) {
                        badUrls.put(url, result);
                    }
                }
        );
        return badUrls;
    }

    private static String checkUrl(String url) {
        HttpURLConnection connection;
        String returnCode = "";
        try {
            connection = (HttpURLConnection) new URL(url).openConnection();
            connection.setRequestMethod("HEAD");
            connection.connect();
            int responseCode = connection.getResponseCode();
            // was "||", which is always true; "&&" flags only codes other than 200 and 302
            if (responseCode != 200 && responseCode != 302) {
                returnCode = responseCode + "";
            }
        } catch (IOException e) {
            // getMessage() may be null or empty; fall back to toString() in that case
            String message = e.getMessage();
            returnCode = (message == null || message.isEmpty()) ? e.toString() : message;
        }
        return returnCode;
    }

}
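
The "5 threads" idea from the text above can also be expressed with an explicit thread pool, since a parallel stream does not let you choose the thread count directly. The following is a minimal sketch under stated assumptions: the class `PooledUrlCheck`, the method `findUnreachable`, and the pluggable `checker` function are hypothetical names introduced for illustration, and the stub checker in `main` stands in for the real HEAD request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

class PooledUrlCheck {

    // Hypothetical helper: runs `checker` for every URL on a fixed pool of `threads` workers.
    // `checker` returns an error description, or Optional.empty() when the URL is fine.
    static Map<String, String> findUnreachable(Collection<String> urls, int threads,
            Function<String, Optional<String>> checker) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        // concurrent sorted map, because several workers write to it at once
        Map<String, String> badUrls = new ConcurrentSkipListMap<>();
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (String url : urls) {
                tasks.add(() -> {
                    checker.apply(url).ifPresent(error -> badUrls.put(url, error));
                    return null;
                });
            }
            // invokeAll blocks until every task has completed
            pool.invokeAll(tasks);
        } finally {
            pool.shutdown();
        }
        return badUrls;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stub checker for illustration only; a real one would issue the HEAD request.
        Function<String, Optional<String>> stub =
                url -> url.contains("bad") ? Optional.of("404") : Optional.empty();
        System.out.println(findUnreachable(
                Arrays.asList("http://good.example", "http://bad.example"), 5, stub));
        // prints {http://bad.example=404}
    }
}
```

Because the work is dominated by waiting on the network rather than by CPU, a pool larger than the number of cores may be reasonable here. The right size is something to measure rather than assume.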
Votes: 2

Stack Overflow user

Posted 2019-10-08 21:36:10

For completeness, this is how I implemented it.

// Pair is the poster's own helper class (not shown in the post);
// judging by its usage, s holds the URL and t the status code or error message.
private static Map<String, String> findUnreachableUrls1(Set<String> urls) {
    // keep only results whose status is neither 200 nor 302
    Predicate<String> unPingableUrlPred = x -> !(x.equals("200") || x.equals("302"));
    Map<String, String> badUrls = urls.parallelStream()
            .map(url -> pingUrl(url))
            .filter(x -> unPingableUrlPred.test(x.t))
            .collect(Collectors.toConcurrentMap(x -> x.s, x -> x.t));
    return badUrls;
}

static Pair<String, String> pingUrl(String url) {
    Pair<String, String> urlResponse = new Pair<>();
    urlResponse.setKey(url);
    HttpURLConnection connection;
    try {
        connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        connection.setRequestMethod("HEAD");
        connection.connect();
        int responseCode = connection.getResponseCode();
        urlResponse.setValue(Integer.toString(responseCode));
    } catch (IOException e) {
        urlResponse.setValue(e.getMessage());
    }
    return urlResponse;
}
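
The snippet depends on a `Pair` type that the post does not include. Judging only by how it is used (`setKey`, `setValue`, and the fields `s` and `t`), a minimal stand-in might look like the sketch below. This is a guess at its shape for illustration, not the poster's actual class.

```java
// Hypothetical stand-in for the Pair class used in the answer above.
class Pair<S, T> {
    S s; // key, read as x.s in the stream pipeline
    T t; // value, read as x.t in the stream pipeline

    void setKey(S key) {
        this.s = key;
    }

    void setValue(T value) {
        this.t = value;
    }
}
```

With a class like this in the same package, plus imports for `Predicate`, `Collectors`, `HttpURLConnection`, `URL`, `IOException`, `Map`, and `Set`, the two methods can be placed in any class and compiled.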
Votes: 0
Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/58283143
