文章/答案/技术大牛

发布

社区首页 >问答首页 >Java伪尾调用递归获得更好的性能

问Java伪尾调用递归获得更好的性能
EN

Stack Overflow用户

提问于 2016-02-16 03:50:23

回答 1查看 58关注 0票数 0

我只是写了一些简单的实用程序来计算linkedList的长度，这样linkedList就不会承载一个大小/长度的“内部”计数器。有了这个想法，我有了三个简单的方法：

遍历linkedList，直到到达“结束”为止
递归计算长度
递归计算长度，而不需要将控制返回到调用函数(使用某种形式的尾调用递归)

下面是一些捕捉这3种情况的代码：

// 1. iterative approach
public static <T> int getLengthIteratively(LinkedList<T> ll) {

    int length = 0;
    for (Node<T> ptr = ll.getHead(); ptr != null; ptr = ptr.getNext()) {
        length++;
    }

    return length;
}

// 2. recursive approach
public static <T> int getLengthRecursively(LinkedList<T> ll) {
    return getLengthRecursively(ll.getHead());
}

private static <T> int getLengthRecursively(Node<T> ptr) {

    if (ptr == null) {
        return 0;
    } else {
        return 1 + getLengthRecursively(ptr.getNext());
    }
}

// 3. Pseudo tail-recursive approach
public static <T> int getLengthWithFakeTailRecursion(LinkedList<T> ll) {
    return getLengthWithFakeTailRecursion(ll.getHead());
}

private static <T> int getLengthWithFakeTailRecursion(Node<T> ptr) {
    return getLengthWithFakeTailRecursion(ptr, 0);
}

private static <T> int getLengthWithFakeTailRecursion(Node<T> ptr, int result) {
    if (ptr == null) {
        return result;
    } else {
        return getLengthWithFakeTailRecursion(ptr.getNext(), result + 1);
    }
}

现在我知道JVM不支持尾递归，但是当我运行一些简单的测试时，当我使用~10k节点对字符串进行linkedLists测试时，我注意到getLengthWithFakeTailRecursion的性能始终优于getLengthRecursively方法(比getLengthRecursively方法高出40%)。增量是否只能归因于这样一个事实:对于case#2，每个节点的控制都被传回，而我们被迫穿越所有堆栈帧？

编辑:下面是我用来检查性能编号的简单测试：

public class LengthCheckerTest {

@Test
public void testLengthChecking() {

    LinkedList<String> ll = new LinkedList<String>();
    int sizeOfList = 12000;
    // int sizeOfList = 100000; // Danger: This causes a stackOverflow in recursive methods!
    for (int i = 1; i <= sizeOfList; i++) {
        ll.addNode(String.valueOf(i));
    }

    long currTime = System.nanoTime();
    Assert.assertEquals(sizeOfList, LengthChecker.getLengthIteratively(ll));
    long totalTime = System.nanoTime() - currTime;
    System.out.println("totalTime taken with iterative approach: " + (totalTime / 1000) + "ms");

    currTime = System.nanoTime();
    Assert.assertEquals(sizeOfList, LengthChecker.getLengthRecursively(ll));
    totalTime = System.nanoTime() - currTime;
    System.out.println("totalTime taken with recursive approach: " + (totalTime / 1000) + "ms");

    // Interestingly, the fakeTailRecursion always runs faster than the vanillaRecursion
    // TODO: Look into whether stack-frame collapsing has anything to do with this
    currTime = System.nanoTime();
    Assert.assertEquals(sizeOfList, LengthChecker.getLengthWithFakeTailRecursion(ll));
    totalTime = System.nanoTime() - currTime;
    System.out.println("totalTime taken with fake TCR approach: " + (totalTime / 1000) + "ms");
}
}

performance

data-structures

jvm

java

回答 1

Stack Overflow用户

回答已采纳

发布于 2016-02-16 04:16:16

你的基准方法是有缺陷的。您可以在同一个JVM中执行所有三个测试，因此它们的位置并不相同。当执行假尾测试时，LinkedList和Node类已经进行了JIT编译，因此工作速度更快。您可以更改测试的顺序，您将看到不同的数字。每个测试都应该在单独的JVM中执行。

让我们为您的情况编写简单的JMH微基准：

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.annotations.*;

// 5 warm-up iterations, 500 ms each, then 10 measurement iterations 500 ms each
// repeat everything three times (with JVM restart)
// output average time in microseconds
@Warmup(iterations = 5, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Fork(3)
@State(Scope.Benchmark)
public class ListTest {
    // You did not supply Node and LinkedList implementation
    // but I assume they look like this
    static class Node<T> {
        final T value;
        Node<T> next;

        public Node(T val) {value = val;}
        public void add(Node<T> n) {next = n;}

        public Node<T> getNext() {return next;}
    }

    static class LinkedList<T> {
        Node<T> head;

        public void setHead(Node<T> h) {head = h;}
        public Node<T> getHead() {return head;}
    }

    // Code from your question follows

    // 1. iterative approach
    public static <T> int getLengthIteratively(LinkedList<T> ll) {

        int length = 0;
        for (Node<T> ptr = ll.getHead(); ptr != null; ptr = ptr.getNext()) {
            length++;
        }

        return length;
    }

    // 2. recursive approach
    public static <T> int getLengthRecursively(LinkedList<T> ll) {
        return getLengthRecursively(ll.getHead());
    }

    private static <T> int getLengthRecursively(Node<T> ptr) {

        if (ptr == null) {
            return 0;
        } else {
            return 1 + getLengthRecursively(ptr.getNext());
        }
    }

    // 3. Pseudo tail-recursive approach
    public static <T> int getLengthWithFakeTailRecursion(LinkedList<T> ll) {
        return getLengthWithFakeTailRecursion(ll.getHead());
    }

    private static <T> int getLengthWithFakeTailRecursion(Node<T> ptr) {
        return getLengthWithFakeTailRecursion(ptr, 0);
    }

    private static <T> int getLengthWithFakeTailRecursion(Node<T> ptr, int result) {
        if (ptr == null) {
            return result;
        } else {
            return getLengthWithFakeTailRecursion(ptr.getNext(), result + 1);
        }
    }

    // Benchmarking code

    // Measure for different list length        
    @Param({"10", "100", "1000", "10000"})
    int n;
    LinkedList<Integer> list;

    @Setup    
    public void setup() {
        list = new LinkedList<>();
        Node<Integer> cur = new Node<>(0);
        list.setHead(cur);
        for(int i=1; i<n; i++) {
            Node<Integer> next = new Node<>(i);
            cur.add(next);
            cur = next;
        }
    }

    // Do not forget to return result to the caller, so it's not optimized out
    @Benchmark    
    public int testIteratively() {
        return getLengthIteratively(list);
    }

    @Benchmark    
    public int testRecursively() {
        return getLengthRecursively(list);
    }

    @Benchmark    
    public int testRecursivelyFakeTail() {
        return getLengthWithFakeTailRecursion(list);
    }
}

这是我的机器上的结果(x64 Win7，Java8u71)

Benchmark                           (n)  Mode  Cnt   Score    Error  Units
ListTest.testIteratively             10  avgt   30   0,009 ±  0,001  us/op
ListTest.testIteratively            100  avgt   30   0,156 ±  0,001  us/op
ListTest.testIteratively           1000  avgt   30   2,248 ±  0,036  us/op
ListTest.testIteratively          10000  avgt   30  26,416 ±  0,590  us/op
ListTest.testRecursively             10  avgt   30   0,014 ±  0,001  us/op
ListTest.testRecursively            100  avgt   30   0,191 ±  0,003  us/op
ListTest.testRecursively           1000  avgt   30   3,599 ±  0,031  us/op
ListTest.testRecursively          10000  avgt   30  40,071 ±  0,328  us/op
ListTest.testRecursivelyFakeTail     10  avgt   30   0,015 ±  0,001  us/op
ListTest.testRecursivelyFakeTail    100  avgt   30   0,190 ±  0,002  us/op
ListTest.testRecursivelyFakeTail   1000  avgt   30   3,609 ±  0,044  us/op
ListTest.testRecursivelyFakeTail  10000  avgt   30  41,534 ±  1,186  us/op

如您所见，假尾速度与简单递归速度相同(在误差范围内)，比迭代方法慢20-60%。所以你的结果不会被复制。

如果您实际上不想获得稳态测量结果，而是单次测试结果(没有热身)，您可以使用以下选项启动相同的基准测试：-ss -wi 0 -i 1 -f 10。结果如下：

Benchmark                           (n)  Mode  Cnt    Score     Error  Units
ListTest.testIteratively             10    ss   10   16,095 ±   0,831  us/op
ListTest.testIteratively            100    ss   10   19,780 ±   6,440  us/op
ListTest.testIteratively           1000    ss   10   74,316 ±  26,434  us/op
ListTest.testIteratively          10000    ss   10  366,496 ±  42,299  us/op
ListTest.testRecursively             10    ss   10   19,594 ±   7,084  us/op
ListTest.testRecursively            100    ss   10   21,973 ±   0,701  us/op
ListTest.testRecursively           1000    ss   10  165,007 ±  54,915  us/op
ListTest.testRecursively          10000    ss   10  563,739 ±  74,908  us/op
ListTest.testRecursivelyFakeTail     10    ss   10   19,454 ±   4,523  us/op
ListTest.testRecursivelyFakeTail    100    ss   10   25,518 ±  11,802  us/op
ListTest.testRecursivelyFakeTail   1000    ss   10  158,336 ±  43,646  us/op
ListTest.testRecursivelyFakeTail  10000    ss   10  755,384 ± 232,940  us/op

因此，正如你所看到的，第一次发射比随后的发射慢很多倍。你的结果还没有被复制。我观察到，testRecursivelyFakeTail对n = 10000的速度实际上更慢(但在热身之后，它达到了与testRecursively相同的峰值速度。

票数 3

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/35423375

复制

相似问题

问Java伪尾调用递归获得更好的性能
EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问Java伪尾调用递归获得更好的性能EN

回答 1

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问Java伪尾调用递归获得更好的性能
EN