• Gradual Degradation of 3DMark Fire Strike Produces Unreliable Results


    Mr. Fox

     

    For those who are not already aware of the issue, the folks at Futuremark seem to be struggling to keep the latest 3DMark benchmark consistent, Fire Strike in particular. Sometime around the release of Time Spy, things started getting screwy with Fire Strike, and it now seems that every Fire Strike GUI version update progressively lowers benchmark scores, specifically in the physics portion of the benchmark.

     

    Kudos to @Papusan for noticing this months ago and asking me to have a look at it. He has been going back and forth with Futuremark about the problem, and it seems they are either ignoring him or do not view it as a high-priority issue. Or maybe, because most people running Fire Strike are not observant enough to notice, care, or ask questions, they feel they don't need to fix it.

     

    Some people might say you cannot compare results across benchmark software versions, but that argument doesn't hold water here. There is a leaderboard and a searchable database of results that basically every benching enthusiast and PC reviewer relies on, and without a very high degree of consistency between GUI versions, the results in that database will become irrelevant, as will the leaderboard. The search has no field to filter by GUI version, so we can expect the results from the database and leaderboard to become increasingly misleading, inaccurate and unreliable over time. That is certainly not desirable for what is supposedly the current de facto standard in PC benchmarks.

     

    You will notice from the examples posted below that the scores get lower and lower with each new version. These examples are consecutive runs on the same day, on the same machine, with identical CPU and GPU settings. The only thing that changes is the 3DMark version, and the Fire Strike results degrade with each newer one. We need Futuremark to understand and correct this.

     

    http://www.3dmark.com/compare/fs/11047304/fs/11047179/fs/11047154

     

     

    Here is a similar example from @Papusan: http://www.3dmark.com/compare/fs/11036017/fs/11035883
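
One way to check the pattern described above on your own machine is to average several runs per GUI version and see whether each newer version comes out lower. The sketch below does this in Python. The scores are made-up placeholders (not values from the linked results), and `runs_by_version`, `version_key`, `mean_scores` and `steadily_declining` are hypothetical names introduced only for illustration.

```python
from statistics import mean

# Hypothetical Physics scores per GUI version; the numbers are made up
# for illustration and are not taken from the linked results.
runs_by_version = {
    "2.2.3491": [14610, 14640, 14598],
    "2.0.2067": [15010, 14995, 15001],
    "2.2.3488": [14820, 14790, 14805],
}


def version_key(version: str) -> tuple[int, ...]:
    """Turn '2.2.3491' into (2, 2, 3491) so versions sort numerically."""
    return tuple(int(part) for part in version.split("."))


def mean_scores(runs: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Return (version, mean score) pairs ordered from oldest to newest."""
    return [(v, mean(runs[v])) for v in sorted(runs, key=version_key)]


def steadily_declining(pairs: list[tuple[str, float]]) -> bool:
    """True when every newer version averages lower than the one before."""
    scores = [score for _, score in pairs]
    return all(newer < older for older, newer in zip(scores, scores[1:]))


pairs = mean_scores(runs_by_version)
for version, score in pairs:
    print(f"{version}: {score:.0f}")
print("steadily declining:", steadily_declining(pairs))
```

Averaging a few runs per version before comparing keeps ordinary run-to-run variation from being mistaken for a trend.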

     

    If you agree this is a problem and want it to be fixed, please complain to Futuremark and let them know they need to put the brakes on and not do anything else with 3DMark until they have this mess under control. Gimmicky features are one thing, but inconsistent benchmark results make 3DMark unreliable.

     

    If you would like to do your own testing to validate the issue before contacting Futuremark, older versions of 3DMark are available for download from the TechPowerUp.com web site. 

     

    In case you're not good at simple math, here is a visual aid to show what the fuss is about.

     

    Incompetence.jpg

     

    Update 12/13/2016:

    We would like to acknowledge that a representative of Futuremark has responded promptly to this article and provided an email address for those interested in communicating with them about the issue. We appreciate the accountability and responsiveness. 

    Update 12/15/2016:

    We are sincerely grateful for Futuremark's responsiveness. I provided additional test results to Mr. Kokko to corroborate the findings of @Papusan, and they have released an update that is expected to resolve the issue. See the message from James below for more details.

    14 hours ago, Futuremark_James said:

    Hello. James from Futuremark here again.


    We've confirmed that there was an issue with the GUI, and we're in the process of rolling out an update (3DMark v2.2.3509) that should fix the scoring discrepancy.

    With this update, overall scores increase slightly by up to 0.3%. Scores from the Physics and CPU parts of benchmark tests may improve by up to 2.5%. These changes bring the scores from 3DMark v2.2.3509 back in line with results from earlier versions that did not have the GUI issue.

     

    For context, it is normal for 3DMark scores to vary by up to 3% between runs since there are some factors in a modern, multitasking operating system that cannot be completely controlled. So again, all credit to @Papusan for noticing the problem and bringing it to us.

     

    To get the update, just open 3DMark and you should get a notification with the option to install it. The Steam version and Steam demo have also been updated.
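
A shift of up to 2.5% is smaller than the up-to-3% run-to-run variation James mentions, so a single pair of runs cannot separate the two. A rough sketch of one way to tell a consistent offset from noise is to compare the means of repeated runs against their standard errors. The scores below are hypothetical, the helper names are invented for this example, and the `k=2.0` cutoff is an assumed rule of thumb rather than anything Futuremark uses.

```python
from math import sqrt
from statistics import mean, stdev


def shift_pct(baseline: list[float], candidate: list[float]) -> float:
    """Percent change of the candidate mean relative to the baseline mean."""
    base = mean(baseline)
    return (mean(candidate) - base) / base * 100.0


def standard_error(scores: list[float]) -> float:
    """Standard error of the mean; needs at least two runs."""
    return stdev(scores) / sqrt(len(scores))


def looks_systematic(baseline: list[float], candidate: list[float], k: float = 2.0) -> bool:
    """True when the gap between means exceeds k combined standard errors."""
    gap = abs(mean(candidate) - mean(baseline))
    combined = sqrt(standard_error(baseline) ** 2 + standard_error(candidate) ** 2)
    return gap > k * combined


# Hypothetical Physics scores from repeated runs (illustration only).
old_gui = [15050, 14900, 15100, 14950]
new_gui = [14700, 14550, 14650, 14600]

print(f"shift: {shift_pct(old_gui, new_gui):.2f}%")
print("looks systematic:", looks_systematic(old_gui, new_gui))
```

With more runs per version the standard error shrinks, so even a small but consistent offset stands out from the noise.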

    On 12/13/2016 at 0:19 PM, Futuremark_James said:

    Hi. James from Futuremark here.

     

    We've been looking into this today, and I'd like to share what we've found.

     

    The Fire Strike workload has not changed at all since 2013. This means that Fire Strike scores should not have changed across app versions either.

     

    We've confirmed that running 3DMark from the command line gives consistent scores across all versions. Unfortunately, it does look like there is an issue when running recent versions from the GUI. We see the same ~2.5% difference in Physics test scores across GUI versions that @Papusan reported to us.

     

    We believe we have found the bug in the GUI, but we need to run some more tests to be sure.

     

    @Mr. Fox, the differences that you are seeing in your results are much larger, and it is not clear why. We would be grateful if you could contact us at info@futuremark.com so we can go through some troubleshooting steps with you.

     

    Thank you, @Papusan, for bringing this to us. I am sorry that we have been slow to respond. I understand how frustrating that is.

     

    I'll post here again when we have more info to share.




    User Feedback




    On 12/15/2016 at 0:53 PM, Futuremark_James said:

    I asked the team about the CPU power monitoring. It seems there was a concern that it didn't work reliably, or at all, with some hardware.

    We're going to look into it again and see whether that is still the case.

    Thank you.

    On 15.12.2016 at 8:53 PM, Futuremark_James said:

    Hello. James from Futuremark here again.


    We've confirmed that there was an issue with the GUI, and we're in the process of rolling out an update (3DMark v2.2.3509) that should fix the scoring discrepancy.

    With this update, overall scores increase slightly by up to 0.3%. Scores from the Physics and CPU parts of benchmark tests may improve by up to 2.5%. These changes bring the scores from 3DMark v2.2.3509 back in line with results from earlier versions that did not have the GUI issue.

     

    For context, it is normal for 3DMark scores to vary by up to 3% between runs since there are some factors in a modern, multitasking operating system that cannot be completely controlled. So again, all credit to @Papusan for noticing the problem and bringing it to us.

     

    To get the update, just open 3DMark and you should get a notification with the option to install it. The Steam version and Steam demo have also been updated.

    Feedback from my testing of the new 3DM GUI version: success.

    I downloaded the latest UI version, 2.2.3509_64.
    My setup is unchanged; only the updated 3DM GUI is new. The Fire Strike physics score is back to normal for my 6700K@4.8GHz = +15505
    Power draw is also back to normal with this updated GUI version. Nice :) Thanks @Futuremark_James
    http://www.3dmark.com/compare/fs/11095260/fs/11049261

    [IMG]

     

    Edit: @Futuremark_James, can you look at the custom settings in the 3DM suite? I and several others use this feature a lot. Every time I launch 3DM and go back to the custom settings to run a new test (e.g. only the physics test or another subtest), I have to choose again which custom test to run. It should not be like this. Do what you have always done with the older 3DM11, where the custom setup is already in place as it was left on previous testing days. This means that 3DM suite should remember previous custom settings for the next time you want to test, e.g., physics or other subtests.

    @Mr. Fox, can you chime in and explain it better if this isn't clear? I want the custom settings changed to work like those in the older 3DM11. Thanks

    2 hours ago, Papusan said:

    @Futuremark_James, can you look at the custom settings in the 3DM suite? I and several others use this feature a lot. Every time I launch 3DM and go back to the custom settings to run a new test (e.g. only the physics test or another subtest), I have to choose again which custom test to run. It should not be like this. Do what you have always done with the older 3DM11, where the custom setup is already in place as it was left on previous testing days. This means that 3DM suite should remember previous custom settings for the next time you want to test, e.g., physics or other subtests.

    @Mr. Fox, can you chime in and explain it better if this isn't clear? I want the custom settings changed to work like those in the older 3DM11. Thanks

    You did a nice job of explaining it, and I agree with you. I prefer using 3DMark 11 instead. One reason is that it is a better benchmark tool, but another is that it remembers my custom settings. That is useful for testing purposes; having to manually change everything each time I open the benchmark, because it does not remember my last-used settings, is inconvenient.

    56 minutes ago, Mr. Fox said:

    You did a nice job of explaining it, and I agree with you. I prefer using 3DMark 11 instead. One reason is that it is a better benchmark tool, but another is that it remembers my custom settings. That is useful for testing purposes; having to manually change everything each time I open the benchmark, because it does not remember my last-used settings, is inconvenient.

    [Image: search result for "Pictures of thinking same"]

    Edited by Papusan
    Quote

    This means that 3DM suite should remember previous custom settings

     

    Thanks for the feedback. That sounds like a sensible idea. I'll pass it on to the team.


    As soon as any form of metrology (benchmark) starts to drift away from a reference standard it becomes impossible to draw comparisons from machine to machine or from one configuration to the next. 

     

    Everything needs to be traceable to a reference piece of hardware that does not change; that means the reference computer gets no operating system updates, no driver changes for any device, and no change even in the environment it is operated in (we know that temperature really does affect performance).

     

    Any company that makes benchmarking systems needs to adopt some "reference box", and its scoring system needs to be compensated against that reference box, or the software tool becomes meaningless.

     

    In electrical engineering test environments we use NIST-traceable standards for everything from voltage, capacitance and resistance to timing. If you do not have a reliable reference, then you are back in the days when "one foot" meant the size of the king's shoe.
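
The "reference box" idea above could be sketched roughly as follows: rerun the benchmark on an unchanging reference machine with each software version, and scale other results by how far the reference has drifted. This is only an illustration of the proposal, not something 3DMark is known to do; `normalized_score` and all of the numbers are hypothetical.

```python
def normalized_score(raw: float, reference_now: float, reference_baseline: float) -> float:
    """Scale a raw score by how far the reference box has drifted.

    reference_baseline: score of the reference box under the baseline version.
    reference_now: score of the same box under the version that produced `raw`.
    """
    if reference_now <= 0 or reference_baseline <= 0:
        raise ValueError("reference scores must be positive")
    drift = reference_now / reference_baseline  # 1.0 means no drift
    return raw / drift


# Hypothetical numbers: the reference box lost 2.5% under the newer version,
# so a raw score taken with that version is scaled back up accordingly.
compensated = normalized_score(raw=9750.0, reference_now=14625.0, reference_baseline=15000.0)
print(f"compensated score: {compensated:.0f}")
```

The compensation only holds if the reference machine itself stays frozen, which is the point the comment above makes about updates, drivers and temperature.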

    On 13/12/2016 at 5:52 AM, Papusan said:

    Thanks for the help bro @Mr. Fox

    I have sent a new feedback email to Futuremark (attn. jarlo Kokko) for the 10th time... Futuremark told me by email a long time ago that "They have reproduced in-house and investigation is ongoing". I have sent them a lot of results for their investigation. Nothing happens, as you can see in the pictures and links!!!

     And when they finally push out the new <FIXED> 3DM version after 3 months, the 3DM benchmark software is in even worse condition...

    Like the last time... The new 3DM suite UI version 2.2.3488 64 came out on 9th Dec. = Fiasco!! Then they needed to push out an even newer one because of the trouble with the first one... One day later, on 10th Dec., the newest messed-up version came out: <UI 2.2.3491 64>.

    The same mess happened the last two times as well (I think in July and Aug). Futuremark have BIG problems with their 3DM Suite!!!

    See the results. Both older UI versions, 2.0.2067_64 and 2.0.2809_64, give 15002 in Physics with a 6700K@4.8GHz and both of the two latest drivers from Nvidia!! Newer UI versions of the 3DM Suite give up to 400 points lower Physics in Fire Strike. All tested with the same Nvidia drivers, stock graphics and 4.8GHz on the processor.
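
As a quick sanity check on the figures above, a drop of up to 400 points from a Physics score of 15002 works out to roughly 2.7%. The few lines below just do that arithmetic; the variable names are illustrative.

```python
baseline_physics = 15002  # Physics score reported with the older UI versions
max_drop_points = 400     # "up to 400 points lower" with newer UI versions

drop_pct = max_drop_points / baseline_physics * 100
lowest_score = baseline_physics - max_drop_points

print(f"lowest expected Physics score: {lowest_score}")
print(f"largest drop: {drop_pct:.2f}%")
```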

    My tests!! (Papusan)
    Tested with latest Nvidia driver 375.95
    http://www.3dmark.com/compare/fs/11036017/fs/11035883#

     


     

    Tested with latest Nvidia driver 376.19
    http://www.3dmark.com/compare/fs/11049261/fs/11057220


     

     

    What if, hypothetically, you hacked your install .inf file and renamed the driver version you're installing to a different one? Does it change the score? I ask because I've always wondered if they are in league with Intel and Nvidia, assigning handicaps to older hardware and a "BOOST" score to newer/next-gen hardware once it's released, to bloat the numbers in their favor and sell hardware. After all, they design hardware to be replaced regularly; they are not trying to do us any favors with this stuff. It's all just a money gimmick!


    I have the exact same issue with my Alienware 15R3 (6820HK at 4.1+ GHz and an overclocked GTX 1070): a gradual degradation from a global score of about 15500 points (19000 graphics) to approximately 15200.




