Bug #4561

Floating-point inaccuracies break CSVFormatter functionality

Added by Tuukka Lehtonen about 4 years ago. Updated about 4 years ago.

Status: Closed
Start date:
Priority: 4
Due date:
Assignee: Tuukka Lehtonen
% Done: 100%
Category: History
Spent time: -
Target version: 1.13.1
Release notes:
Story points: S
Velocity based estimate: -
Release: Simantics 1.13.2
Release relationship: Auto

Description

CSVFormatter history data sampling currently happens at inaccurate moments in time due to floating-point calculation inaccuracies.

Here's a snapshot of a sampling of data that has samples at 0.2 second intervals. The first number on each line shows the sampling time CSVFormatter chooses when sampling is set to a 0.2 second fixed time step; the second shows the time of the sample that was actually read.

SAMPLING TIME: 0.0, SAMPLED: 0.0 = 0.7199996113777161
SAMPLING TIME: 0.2, SAMPLED: 0.2 = 0.7066662907600403
SAMPLING TIME: 0.4, SAMPLED: 0.4 = 0.6933329701423645
SAMPLING TIME: 0.6000000000000001, SAMPLED: 0.6 = 0.6799996495246887
SAMPLING TIME: 0.8, SAMPLED: 0.8 = 0.6666663289070129
SAMPLING TIME: 1.0, SAMPLED: 1.0 = 0.6533329486846924
SAMPLING TIME: 1.2, SAMPLED: 1.2 = 0.6399996280670166
SAMPLING TIME: 1.4, SAMPLED: 1.4 = 0.6266663074493408
SAMPLING TIME: 1.5999999999999999, SAMPLED: 1.4 = 0.6266663074493408
SAMPLING TIME: 1.7999999999999998, SAMPLED: 1.6 = 0.613332986831665
SAMPLING TIME: 1.9999999999999998, SAMPLED: 1.8 = 0.5999996066093445
SAMPLING TIME: 2.1999999999999997, SAMPLED: 2.0 = 0.5866662859916687
...

We can see that the sampling time calculation in fixed-time-step mode is inaccurate. As a result, the code reads a different sample than expected: at the computed time 1.5999999999999999, for example, it picks the sample stored at 1.4 instead of the one at 1.6. This makes the exported CSV data lag behind what was intended to be sampled. The time stepping code currently works as follows:
double time = from;
do {
    // write time
    // write item values
    // next line
    time += timeStep;
} while (time < end);

Such iterative summation inevitably accumulates rounding error in the value of time, because a step such as 0.2 has no exact binary floating-point representation. Calculating time = n*timeStep would be more accurate, but it is still imprecise and causes the same kinds of sampling problems. The only way to stabilize the fixed time step calculation is to use BigDecimal.
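The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the actual CSVFormatter code: the class name FixedStepTimes and both method names are hypothetical, and the decimal variant assumes that the start time, end time and time step are available as decimal strings.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

public class FixedStepTimes {

    /** Accumulates the time in a double, like the loop shown in the description. */
    public static List<Double> doubleSteps(double from, double end, double timeStep) {
        List<Double> times = new ArrayList<>();
        double time = from;
        do {
            times.add(time);
            time += timeStep; // rounding error accumulates here
        } while (time < end);
        return times;
    }

    /** Accumulates the time in BigDecimal and converts to double only when reading it. */
    public static List<Double> decimalSteps(String from, String end, String timeStep) {
        List<Double> times = new ArrayList<>();
        BigDecimal step = new BigDecimal(timeStep);
        BigDecimal limit = new BigDecimal(end);
        BigDecimal time = new BigDecimal(from);
        do {
            times.add(time.doubleValue());
            time = time.add(step); // exact decimal addition
        } while (time.compareTo(limit) < 0);
        return times;
    }

    public static void main(String[] args) {
        System.out.println(doubleSteps(0.0, 2.0, 0.2));
        System.out.println(decimalSteps("0.0", "2.0", "0.2"));
    }
}
```

Constructing the BigDecimal values from strings (or with BigDecimal.valueOf) matters here: new BigDecimal(0.2) would capture the binary approximation of 0.2 and carry the error into the decimal arithmetic.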

However, this is not the only source of inaccuracies in CSV export. Depending on the time stamps produced by the simulator where the history data comes from, inaccuracies may also arise from there. There are obviously cases where CSV export samples the stored history data at points in time that are nowhere near a true sample in the time space. The only solution to this is to allow specification of how to interpolate samples and then perform the interpolation. Plain linear interpolation is pretty much the only useful interpolation at this point.
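Linear interpolation between the two stored samples surrounding a requested time could look roughly like the sketch below. It is an assumption-laden illustration rather than the implementation committed for this issue: the class LinearInterpolation, the method interpolate and its parameter layout are hypothetical, and the choice of MathContext.DECIMAL64 for the division is arbitrary.

```java
import java.math.BigDecimal;
import java.math.MathContext;

public class LinearInterpolation {

    /**
     * Returns the value at time t on the straight line through the samples
     * (t0, v0) and (t1, v1). If both samples share the same time stamp,
     * v0 is returned. Inputs must be finite: BigDecimal.valueOf rejects
     * NaN and infinite values with a NumberFormatException.
     */
    public static double interpolate(double t0, double v0, double t1, double v1, BigDecimal t) {
        BigDecimal bt0 = BigDecimal.valueOf(t0);
        BigDecimal bt1 = BigDecimal.valueOf(t1);
        BigDecimal span = bt1.subtract(bt0);
        if (span.signum() == 0)
            return v0;
        BigDecimal bv0 = BigDecimal.valueOf(v0);
        BigDecimal bv1 = BigDecimal.valueOf(v1);
        // Fraction of the way from t0 to t1; the division needs a rounding context.
        BigDecimal fraction = t.subtract(bt0).divide(span, MathContext.DECIMAL64);
        return bv0.add(bv1.subtract(bv0).multiply(fraction)).doubleValue();
    }

    public static void main(String[] args) {
        // Sample halfway between two stored samples at t = 1.4 and t = 1.6.
        System.out.println(interpolate(1.4, 10.0, 1.6, 20.0, new BigDecimal("1.5")));
    }
}
```

Taking the requested time as a BigDecimal lets it come straight from a decimal time-stepping loop without a round trip through double.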

Associated revisions

Revision 28366
Added by Tuukka Lehtonen about 4 years ago

Added new NumberFormat implementation: FormattingUtils.significantDigitFormat(int precision).

refs #4561

Revision 28367
Added by Tuukka Lehtonen about 4 years ago

Fixed typo in TrendNode comment.
refs #4561

Revision 28368
Added by Tuukka Lehtonen about 4 years ago

Fixed chart editor value tip box to show either < or > character in front of the rounded time value depending on whether the actual floating point is really less than or greater than the rounded value.

This tries to convey to the user that the actual sampling time does not exactly match the time value shown in the value tip hover box.

refs #4561

Revision 28369
Added by Tuukka Lehtonen about 4 years ago

Testing BigDecimal formatting.
refs #4561

Revision 28370
Added by Tuukka Lehtonen about 4 years ago

Fixed DatasourceAdapter.list's invalid invocations to Collector.setValue.
refs #4561

Revision 28371
Added by Tuukka Lehtonen about 4 years ago

  • Replaced the previous CSV export number format preference with significant digit count preferences. It doesn't really matter in which particular format numbers are exported, as long as they have the right amount of accuracy.
  • Fixed CSVFormatter fixed-time-step stepping to use BigDecimal for calculating the current time. Still missing interpolation for samples.

refs #4561

Revision 28376
Added by Tuukka Lehtonen about 4 years ago

BigDecimal-based linear interpolation sampling for CSVFormatter.
Interpolation only applies to numeric float/double values, not integer-types or booleans.
Integers and booleans are still sampled using the previous value.

refs #4561

History

#1 Updated by Tuukka Lehtonen about 4 years ago

  • Subject changed from DataSourceAdapter does not work with anything else than GraphHandles to Floating-point inaccuracies break CSVFormatter functionality
  • Description updated (diff)
  • Assignee set to Tuukka Lehtonen

#2 Updated by Tuukka Lehtonen about 4 years ago

  • Category set to History
  • Status changed from New to In Progress

#3 Updated by Tuukka Lehtonen about 4 years ago

  • % Done changed from 0 to 70

#4 Updated by Tuukka Lehtonen about 4 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 70 to 100

Linear interpolation was implemented in r28376. Marking resolved.

#5 Updated by Tuukka Lehtonen about 4 years ago

  • Status changed from Resolved to Closed
