Float output precision issue - failed test for abc315_f problem despite correct solution #129

@filip-chodziutko

I recently evaluated an LLM-generated solution for the abc315_f problem using lcb_runner.runner.custom_evaluator,
with the following file as input:

[
    {
        "question_id": "abc315_f",
        "code_list": [
            'import math\n\ndef main():\n    import sys\n    input = sys.stdin.read().split()\n    idx = 0\n    N = int(input[idx])\n    idx += 1\n    points = []\n    for _ in range(N):\n        x = int(input[idx])\n        y = int(input[idx+1])\n        points.append((x, y))\n        idx += 2\n    \n    INF = float(\'inf\')\n    max_c = 20  # Maximum number of skips considered\n    dp = [[INF] * (max_c + 1) for _ in range(N)]\n    dp[0][0] = 0.0  # Starting at checkpoint 0 with 0 skips\n\n    for i in range(1, N):\n        start_j = max(0, i - (max_c + 1))\n        for j in range(start_j, i):\n            for c_prev in range(0, max_c + 1):\n                if dp[j][c_prev] == INF:\n                    continue\n                skips = i - j - 1\n                new_c = c_prev + skips\n                if new_c > max_c:\n                    continue\n                x1, y1 = points[j]\n                x2, y2 = points[i]\n                dist = math.hypot(x2 - x1, y2 - y1)\n                if dp[i][new_c] > dp[j][c_prev] + dist:\n                    dp[i][new_c] = dp[j][c_prev] + dist\n\n    min_total = INF\n    for c in range(0, max_c + 1):\n        if dp[N-1][c] == INF:\n            continue\n        total = dp[N-1][c]\n        if c == 0:\n            total += 0.0\n        else:\n            total += (2 ** (c - 1))\n        if total < min_total:\n            min_total = total\n\n    print("{0:.20f}".format(min_total))\n\nif __name__ == "__main__":\n    main()'
        ]
    }
]

I noticed that the first public test fails with the following error_message:

Wrong answer at output_line_idx=0: 5.82842712474618984686 != 5.82842712474619009753

However, the question_content states:

Your output is considered correct if the absolute or relative error from the true value is at most 10^{-5}.
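A check in the spirit of that statement could look like the sketch below. The helper name `within_tolerance` is hypothetical and is not part of lcb_runner; it only illustrates the usual "absolute or relative error" rule, assuming the expected output is taken as the true value.

```python
def within_tolerance(actual: str, expected: str, tol: float = 1e-5) -> bool:
    """Return True if `actual` is within `tol` absolute OR relative error of `expected`.

    Hypothetical helper for illustration only; not taken from lcb_runner.
    """
    a = float(actual)
    e = float(expected)
    # abs(a - e) <= tol            covers the absolute-error case
    # abs(a - e) <= tol * abs(e)   covers the relative-error case
    return abs(a - e) <= tol * max(1.0, abs(e))


# The two values from the error message pass easily under this rule.
ok = within_tolerance("5.82842712474618984686", "5.82842712474619009753")
```

With this rule the reported output would be accepted, since the two values differ by far less than 10^{-5}.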

It looks like the expected answer is stored with such high precision that Python's built-in float, which the solution uses, cannot reproduce it digit for digit (as far as I understand from the codebase, decimal.Decimal is used to perform this exact match).
I ran an additional simple test:

import numpy as np
print(
    float("5.82842712474618984686") == float("5.82842712474619009753"),
    np.float64("5.82842712474618984686") == np.float64("5.82842712474619009753"),
)
# Outputs: True True

This confirms that both strings parse to the same 64-bit float, so the solution is correct even if we ignore the tolerance stated in the question and compare the values as float64.
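For contrast, comparing the same two strings as decimal.Decimal shows why an exact decimal match fails. This is only a sketch of the behavior described above, not the runner's actual comparison code:

```python
from decimal import Decimal

got = Decimal("5.82842712474618984686")
expected = Decimal("5.82842712474619009753")

# Decimal keeps every digit, so the two values are not equal.
exact_match = got == expected

# The gap is many orders of magnitude below the 1e-5 tolerance.
diff = abs(got - expected)
print(exact_match, diff)
```

The difference is on the order of 1e-16, which is below the resolution of a float64 at this magnitude.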

The most straightforward fix seems to be lowering the precision of the expected output for the abc315_f problem.
It is probably worth checking whether other problems face the same issue (I evaluated only a small subset and found just this one).
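One possible way to look for other affected problems is to flag expected-output tokens whose decimal value does not survive a round trip through float64. The sketch below is a heuristic under stated assumptions: the function name `needs_more_than_float64` is hypothetical, and how the expected outputs are loaded from the dataset is left out because it depends on the dataset layout.

```python
from decimal import Decimal, InvalidOperation


def needs_more_than_float64(token: str) -> bool:
    """Heuristic: True if a decimal token carries digits a float64 cannot keep.

    Hypothetical helper for illustration; not part of lcb_runner.
    Only tokens containing a decimal point are considered, so large
    integers and non-numeric words are never flagged.
    """
    if "." not in token:
        return False
    try:
        exact = Decimal(token)
        # repr(float) gives the shortest string that round-trips the float64.
        round_tripped = Decimal(repr(float(token)))
    except (InvalidOperation, ValueError):
        return False
    return exact != round_tripped


# Example with the expected value from this issue and two harmless tokens.
flags = [needs_more_than_float64(t)
         for t in ["5.82842712474619009753", "0.5", "5.000000000000000000"]]
```

A flagged token does not prove that a test is broken; it only marks expected outputs where an exact decimal comparison can reject a correct float-based solution.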
