After spending most of the past decade without a decent computer, all laptops with GPUs more able at toasting bread than proper gaming, I finally cracked the spare Bitcoin piggy-bank and built my dream machine with an i7-4790k and Nvidia 970 GPU inside it.
I could play Witcher 3 at last, so many great games to catch up on. :-)
But before that I had to get the maximum performance the hardware can provide through overclocking.
The point is, I’m a nerd and nerds like to tweak things. We’re not the kind that puts up with bloated closed-source software and crappy Xmas-tree GUIs. I thus needed a simple and snappy tool to overclock my brand new GPU and bring it on par with a 980 model.
Let’s be honest here, the main contenders (or rather offenders) would make the eyes of any sane person bleed instantly:
Those tools are respectively from MSI, EVGA, Gigabyte and Asus.
A quick look at the interface and features suggests they are very similar and probably all built upon the same toolkit, which is called “RTHAL” in MSI Afterburner.
Anyways this is bad software and their authors should feel bad. Not everyone buying those graphics cards is a 14yo xXX_l33thaxor1ny0ma|\/|4_XXx who wants dragons and giant robots on their packaging.
There is no light and open source overclocking software for power users these days, mostly because GPU makers won’t publish their docs, the situation needs a fix.
Where do we start?
Nvidia has an API to talk to their driver, at least under Windows. It’s conveniently named NvAPI and it is documented here: GPU Performance State Interface.
What would hit the hopeful coder square in the face when reading that is the very reduced set of functions available:
NVAPI_INTERFACE NvAPI_GPU_GetPstates20 (__in NvPhysicalGpuHandle hPhysicalGpu, __inout NV_GPU_PERF_PSTATES20_INFO *pPstatesInfo)
NVAPI_INTERFACE NvAPI_GPU_GetCurrentPstate (NvPhysicalGpuHandle hPhysicalGpu, NV_GPU_PERF_PSTATE_ID *pCurrentPstate)
NVAPI_INTERFACE NvAPI_GPU_GetDynamicPstatesInfoEx (NvPhysicalGpuHandle hPhysicalGpu, NV_GPU_DYNAMIC_PSTATES_INFO_EX *pDynamicPstatesInfoEx)
Yep, that also sucks, plenty of functions with “Get” in their names but almost none with “Set”.
Indeed this public API is incomplete. After lurking the interwebs, it seems the full-featured API, headers, libs and docs are provided under an NDA, and I suppose there is not the slightest chance I could access that information legitimately.
I don’t have time to waste jumping through those hoops, creating accounts and whatnot either. I just want to overclock my GPU, and for once the Internet doesn’t have anything of that sort readily available since RivaTuner, which was never open source in the first place.
So let’s grab a shovel and go deeper.
What do we know, what do we need?
We sure know that tools from card makers can do overclocking through NvAPI by accessing the undocumented functions.
Probably if there’s some “public” Get…
NvAPI_GPU_GetPstates20
then there’s a “private” Set… hiding somewhere:
NvAPI_GPU_SetPstates20
Maybe this is more complicated than that for all we know, so let’s start by running MSI Afterburner inside Ollydbg. Browsing the string references, we quickly land here:
“nvapi.dll” definitely gets loaded here using LoadLibrary/GetModuleHandle. We’re on the right track.
Now where exactly is that lib used? There could be thousands of occurrences.
That’s simple: with the program running and the realtime graph disabled (it polls NvAPI constantly, adding noise to the mass of API calls), we place a memory breakpoint on the .text segment of nvapi.dll inside MSI Afterburner’s process (just hit F2 in the segments window when NvAPI is highlighted…).
Then we set the sliders in the MSI tool to get some negligible GPU underclock and hit the “apply” button. It breaks inside NvAPI… magic!
But wait, this isn’t the “overclocking” (SetPstates20()) function there, the symbol for the return pointer on the top of the stack shows something along the lines of “QueryInterface”.
Long story short, this “NvAPI_QueryInterface” function is the only function exported from nvapi.dll.
Its purpose is to take the ID of a function in the API and return a pointer to the actual code of the function in the mapped process. It probably serves as a convenient layer for not breaking the API across updates and also for obfuscating the entry points where the goods are to be found.
Actually if you get the NVapi SDK from Nvidia’s website you’ll find a linkable module inside the archive. It serves exactly no purpose, just acts as an “exports proxy”, it exports the name of all the public functions from the API, when the functions are called it retrieves the real pointer with the ID it holds and executes the real function.
Ultimately the end user/programmer doesn’t have to be aware of all those ID things, he would just call the public functions and link the module from the public SDK using the public headers.
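To make the “exports proxy” idea a bit more concrete, here is a minimal sketch of how such a stub could work. Everything in it is made up for illustration: `mock_query_interface`, `Demo_GetAnswer`, `real_get_answer` and the ID `0x11111111` are hypothetical stand-ins, not names or values from the SDK. The only thing it borrows from the real thing is the shape: an ID goes in, a pointer comes out, and the public-looking function forwards the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Neutral function-pointer type for storage; cast back before calling. */
typedef void (*generic_fn)(void);

/* Mock "driver" side: a hypothetical function hidden behind a made-up ID. */
int real_get_answer(int *out) { *out = 42; return 0; }

/* (ID, address) pairs, in the spirit of a lookup table inside the DLL. */
static const struct { uint32_t id; generic_fn fn; } mock_table[] = {
    { 0x11111111u, (generic_fn)real_get_answer },
};

/* Stand-in for the single export: function ID in, pointer (or NULL) out. */
generic_fn mock_query_interface(uint32_t id)
{
    for (size_t i = 0; i < sizeof mock_table / sizeof mock_table[0]; i++)
        if (mock_table[i].id == id)
            return mock_table[i].fn;
    return NULL;
}

/* Proxy side: a public-looking stub that resolves once, then forwards. */
typedef int (*get_answer_t)(int *out);

int Demo_GetAnswer(int *out)
{
    static get_answer_t cached = NULL;

    assert(out != NULL);
    if (cached == NULL)
        cached = (get_answer_t)mock_query_interface(0x11111111u);
    if (cached == NULL)
        return -1;   /* this "driver" build does not know the ID */
    return cached(out);
}
```

The caller only ever sees `Demo_GetAnswer()`; the ID and the lookup stay out of sight, which is exactly why the end user never has to care about them.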
You may already have guessed, I don’t want to proceed that way.
Luckily, if you look again at the previous screenshot of Ollydbg inside the QueryInterface() function, you’ll find the sole INT argument to the function on top of the stack just under the return pointer: it’s 0xF4DAE6B. We’re getting closer!
Let’s continue running the program in Olly and break a second time on NvAPI. We learn from the symbols floating around that MSI Afterburner just initiated a call to “Nv_SetPStates20()”, so 0xF4DAE6B is certainly the ID of the function we’re looking for.
Good we just need its prototype and arguments to be in turn able to declare and use it inside our own code.
Also a quick web search for 0xF4DAE6B yielded this very interesting result where an amazing Russian dude with only 2 messages on his stackoverflow.com profile still found a way to drop this sweet piece of data which looks disturbingly like what the NDA version of the API would be:
_NvAPI_Initialize 150E828h
_NvAPI_Unload 0D22BDD7Eh
_NvAPI_GetErrorMessage 6C2D048Ch
_NvAPI_GetInterfaceVersionString 1053FA5h
_NvAPI_GetDisplayDriverVersion 0F951A4D1h
_NvAPI_SYS_GetDriverAndBranchVersion 2926AAADh
_NvAPI_EnumNvidiaDisplayHandle 9ABDD40Dh
_NvAPI_EnumNvidiaUnAttachedDisplayHandle 20DE9260h
_NvAPI_EnumPhysicalGPUs 0E5AC921Fh
_NvAPI_EnumLogicalGPUs 48B3EA59h
_NvAPI_GetPhysicalGPUsFromDisplay 34EF9506h
_NvAPI_GetPhysicalGPUFromUnAttachedDisplay 5018ED61h
_NvAPI_CreateDisplayFromUnAttachedDisplay 63F9799Eh
_NvAPI_GetLogicalGPUFromDisplay 0EE1370CFh
_NvAPI_GetLogicalGPUFromPhysicalGPU 0ADD604D1h
_NvAPI_GetPhysicalGPUsFromLogicalGPU 0AEA3FA32h
_NvAPI_GetAssociatedNvidiaDisplayHandle 35C29134h
_NvAPI_DISP_GetAssociatedUnAttachedNvidiaDisplayHandle 0A70503B2h
_NvAPI_GetAssociatedNvidiaDisplayName 22A78B05h
_NvAPI_GetUnAttachedAssociatedDisplayName 4888D790h
_NvAPI_EnableHWCursor 2863148Dh
_NvAPI_DisableHWCursor 0AB163097h
_NvAPI_GetVBlankCounter 67B5DB55h
_NvAPI_SetRefreshRateOverride 3092AC32h
_NvAPI_GetAssociatedDisplayOutputId 0D995937Eh
_NvAPI_GetDisplayPortInfo 0C64FF367h
_NvAPI_SetDisplayPort 0FA13E65Ah
_NvAPI_GetHDMISupportInfo 6AE16EC3h
_NvAPI_DISP_EnumHDMIStereoModes 0D2CCF5D6h
_NvAPI_GetInfoFrame 9734F1Dh
_NvAPI_SetInfoFrame 69C6F365h
_NvAPI_SetInfoFrameState 67EFD887h
_NvAPI_GetInfoFrameState 41511594h
_NvAPI_Disp_InfoFrameControl 6067AF3Fh
_NvAPI_Disp_ColorControl 92F9D80Dh
_NvAPI_DISP_GetVirtualModeData 3230D69Ah
_NvAPI_DISP_OverrideDisplayModeList 291BFF2h
_NvAPI_GetDisplayDriverMemoryInfo 774AA982h
_NvAPI_GetDriverMemoryInfo 2DC95125h
_NvAPI_GetDVCInfo 4085DE45h
_NvAPI_SetDVCLevel 172409B4h
_NvAPI_GetDVCInfoEx 0E45002Dh
_NvAPI_SetDVCLevelEx 4A82C2B1h
_NvAPI_GetHUEInfo 95B64341h
_NvAPI_SetHUEAngle 0F5A0F22Ch
_NvAPI_GetImageSharpeningInfo 9FB063DFh
_NvAPI_SetImageSharpeningLevel 3FC9A59Ch
_NvAPI_D3D_GetCurrentSLIState 4B708B54h
_NvAPI_D3D9_RegisterResource 0A064BDFCh
_NvAPI_D3D9_UnregisterResource 0BB2B17AAh
_NvAPI_D3D9_AliasSurfaceAsTexture 0E5CEAE41h
_NvAPI_D3D9_StretchRectEx 22DE03AAh
_NvAPI_D3D9_ClearRT 332D3942h
_NvAPI_D3D_CreateQuery 5D19BCA4h
_NvAPI_D3D_DestroyQuery 0C8FF7258h
_NvAPI_D3D_Query_Begin 0E5A9AAE0h
_NvAPI_D3D_Query_End 2AC084FAh
_NvAPI_D3D_Query_GetData 0F8B53C69h
_NvAPI_D3D_Query_GetDataSize 0F2A54796h
_NvAPI_D3D_Query_GetType 4ACEEAF7h
_NvAPI_D3D_RegisterApp 0D44D3C4Eh
_NvAPI_D3D9_CreatePathContextNV 0A342F682h
_NvAPI_D3D9_DestroyPathContextNV 667C2929h
_NvAPI_D3D9_CreatePathNV 71329DF3h
_NvAPI_D3D9_DeletePathNV 73E0019Ah
_NvAPI_D3D9_PathVerticesNV 0C23DF926h
_NvAPI_D3D9_PathParameterfNV 0F7FF00C1h
_NvAPI_D3D9_PathParameteriNV 0FC31236Ch
_NvAPI_D3D9_PathMatrixNV 0D2F6C499h
_NvAPI_D3D9_PathDepthNV 0FCB16330h
_NvAPI_D3D9_PathClearDepthNV 157E45C4h
_NvAPI_D3D9_PathEnableDepthTestNV 0E99BA7F3h
_NvAPI_D3D9_PathEnableColorWriteNV 3E2804A2h
_NvAPI_D3D9_DrawPathNV 13199B3Dh
_NvAPI_D3D9_GetSurfaceHandle 0F2DD3F2h
_NvAPI_D3D9_GetOverlaySurfaceHandles 6800F5FCh
_NvAPI_D3D9_GetTextureHandle 0C7985ED5h
_NvAPI_D3D9_GpuSyncGetHandleSize 80C9FD3Bh
_NvAPI_D3D9_GpuSyncInit 6D6FDAD4h
_NvAPI_D3D9_GpuSyncEnd 754033F0h
_NvAPI_D3D9_GpuSyncMapTexBuffer 0CDE4A28Ah
_NvAPI_D3D9_GpuSyncMapSurfaceBuffer 2AB714ABh
_NvAPI_D3D9_GpuSyncMapVertexBuffer 0DBC803ECh
_NvAPI_D3D9_GpuSyncMapIndexBuffer 12EE68F2h
_NvAPI_D3D9_SetPitchSurfaceCreation 18CDF365h
_NvAPI_D3D9_GpuSyncAcquire 0D00B8317h
_NvAPI_D3D9_GpuSyncRelease 3D7A86BBh
_NvAPI_D3D9_GetCurrentRenderTargetHandle 22CAD61h
_NvAPI_D3D9_GetCurrentZBufferHandle 0B380F218h
_NvAPI_D3D9_GetIndexBufferHandle 0FC5A155Bh
_NvAPI_D3D9_GetVertexBufferHandle 72B19155h
_NvAPI_D3D9_CreateTexture 0D5E13573h
_NvAPI_D3D9_AliasPrimaryAsTexture 13C7112Eh
_NvAPI_D3D9_PresentSurfaceToDesktop 0F7029C5h
_NvAPI_D3D9_CreateVideoBegin 84C9D553h
_NvAPI_D3D9_CreateVideoEnd 0B476BF61h
_NvAPI_D3D9_CreateVideo 89FFD9A3h
_NvAPI_D3D9_FreeVideo 3111BED1h
_NvAPI_D3D9_PresentVideo 5CF7F862h
_NvAPI_D3D9_VideoSetStereoInfo 0B852F4DBh
_NvAPI_D3D9_SetGamutData 2BBDA32Eh
_NvAPI_D3D9_SetSurfaceCreationLayout 5609B86Ah
_NvAPI_D3D9_GetVideoCapabilities 3D596B93h
_NvAPI_D3D9_QueryVideoInfo 1E6634B3h
_NvAPI_D3D9_AliasPrimaryFromDevice 7C20C5BEh
_NvAPI_D3D9_SetResourceHint 905F5C27h
_NvAPI_D3D9_Lock 6317345Ch
_NvAPI_D3D9_Unlock 0C182027Eh
_NvAPI_D3D9_GetVideoState 0A4527BF8h
_NvAPI_D3D9_SetVideoState 0BD4BC56Fh
_NvAPI_D3D9_EnumVideoFeatures 1DB7C52Ch
_NvAPI_D3D9_GetSLIInfo 694BFF4Dh
_NvAPI_D3D9_SetSLIMode 0BFDC062Ch
_NvAPI_D3D9_QueryAAOverrideMode 0DDF5643Ch
_NvAPI_D3D9_VideoSurfaceEncryptionControl 9D2509EFh
_NvAPI_D3D9_DMA 962B8AF6h
_NvAPI_D3D9_EnableStereo 492A6954h
_NvAPI_D3D9_StretchRect 0AEAECD41h
_NvAPI_D3D9_CreateRenderTarget 0B3827C8h
_NvAPI_D3D9_NVFBC_GetStatus 0BD3EB475h
_NvAPI_D3D9_IFR_SetUpTargetBufferToSys 55255D05h
_NvAPI_D3D9_GPUBasedCPUSleep 0D504DDA7h
_NvAPI_D3D9_IFR_TransferRenderTarget 0AB7C2DCh
_NvAPI_D3D9_IFR_SetUpTargetBufferToNV12BLVideoSurface 0CFC92C15h
_NvAPI_D3D9_IFR_TransferRenderTargetToNV12BLVideoSurface 5FE72F64h
_NvAPI_D3D10_AliasPrimaryAsTexture 8AAC133Dh
_NvAPI_D3D10_SetPrimaryFlipChainCallbacks 73EB9329h
_NvAPI_D3D10_ProcessCallbacks 0AE9C2019h
_NvAPI_D3D10_GetRenderedCursorAsBitmap 0CAC3CE5Dh
_NvAPI_D3D10_BeginShareResource 35233210h
_NvAPI_D3D10_BeginShareResourceEx 0EF303A9Dh
_NvAPI_D3D10_EndShareResource 0E9C5853h
_NvAPI_D3D10_SetDepthBoundsTest 4EADF5D2h
_NvAPI_D3D10_CreateDevice 2DE11D61h
_NvAPI_D3D10_CreateDeviceAndSwapChain 5B803DAFh
_NvAPI_D3D11_CreateDevice 6A16D3A0h
_NvAPI_D3D11_CreateDeviceAndSwapChain 0BB939EE5h
_NvAPI_D3D11_BeginShareResource 121BDC6h
_NvAPI_D3D11_EndShareResource 8FFB8E26h
_NvAPI_D3D11_SetDepthBoundsTest 7AAF7A04h
_NvAPI_GPU_GetShaderPipeCount 63E2F56Fh
_NvAPI_GPU_GetShaderSubPipeCount 0BE17923h
_NvAPI_GPU_GetPartitionCount 86F05D7Ah
_NvAPI_GPU_GetMemPartitionMask 329D77CDh
_NvAPI_GPU_GetTPCMask 4A35DF54h
_NvAPI_GPU_GetSMMask 0EB7AF173h
_NvAPI_GPU_GetTotalTPCCount 4E2F76A8h
_NvAPI_GPU_GetTotalSMCount 0AE5FBCFEh
_NvAPI_GPU_GetTotalSPCount 0B6D62591h
_NvAPI_GPU_GetGpuCoreCount 0C7026A87h
_NvAPI_GPU_GetAllOutputs 7D554F8Eh
_NvAPI_GPU_GetConnectedOutputs 1730BFC9h
_NvAPI_GPU_GetConnectedSLIOutputs 680DE09h
_NvAPI_GPU_GetConnectedDisplayIds 78DBA2h
_NvAPI_GPU_GetAllDisplayIds 785210A2h
_NvAPI_GPU_GetConnectedOutputsWithLidState 0CF8CAF39h
_NvAPI_GPU_GetConnectedSLIOutputsWithLidState 96043CC7h
_NvAPI_GPU_GetSystemType 0BAAABFCCh
_NvAPI_GPU_GetActiveOutputs 0E3E89B6Fh
_NvAPI_GPU_GetEDID 37D32E69h
_NvAPI_GPU_SetEDID 0E83D6456h
_NvAPI_GPU_GetOutputType 40A505E4h
_NvAPI_GPU_GetDeviceDisplayMode 0D2277E3Ah
_NvAPI_GPU_GetFlatPanelInfo 36CFF969h
_NvAPI_GPU_ValidateOutputCombination 34C9C2D4h
_NvAPI_GPU_GetConnectorInfo 4ECA2C10h
_NvAPI_GPU_GetFullName 0CEEE8E9Fh
_NvAPI_GPU_GetPCIIdentifiers 2DDFB66Eh
_NvAPI_GPU_GetGPUType 0C33BAEB1h
_NvAPI_GPU_GetBusType 1BB18724h
_NvAPI_GPU_GetBusId 1BE0B8E5h
_NvAPI_GPU_GetBusSlotId 2A0A350Fh
_NvAPI_GPU_GetIRQ 0E4715417h
_NvAPI_GPU_GetVbiosRevision 0ACC3DA0Ah
_NvAPI_GPU_GetVbiosOEMRevision 2D43FB31h
_NvAPI_GPU_GetVbiosVersionString 0A561FD7Dh
_NvAPI_GPU_GetAGPAperture 6E042794h
_NvAPI_GPU_GetCurrentAGPRate 0C74925A0h
_NvAPI_GPU_GetCurrentPCIEDownstreamWidth 0D048C3B1h
_NvAPI_GPU_GetPhysicalFrameBufferSize 46FBEB03h
_NvAPI_GPU_GetVirtualFrameBufferSize 5A04B644h
_NvAPI_GPU_GetQuadroStatus 0E332FA47h
_NvAPI_GPU_GetBoardInfo 22D54523h
_NvAPI_GPU_GetRamType 57F7CAACh
_NvAPI_GPU_GetFBWidthAndLocation 11104158h
_NvAPI_GPU_GetAllClockFrequencies 0DCB616C3h
_NvAPI_GPU_GetPerfClocks 1EA54A3Bh
_NvAPI_GPU_SetPerfClocks 7BCF4ACh
_NvAPI_GPU_GetCoolerSettings 0DA141340h
_NvAPI_GPU_SetCoolerLevels 891FA0AEh
_NvAPI_GPU_RestoreCoolerSettings 8F6ED0FBh
_NvAPI_GPU_GetCoolerPolicyTable 518A32Ch
_NvAPI_GPU_SetCoolerPolicyTable 987947CDh
_NvAPI_GPU_RestoreCoolerPolicyTable 0D8C4FE63h
_NvAPI_GPU_GetPstatesInfo 0BA94C56Eh
_NvAPI_GPU_GetPstatesInfoEx 843C0256h
_NvAPI_GPU_SetPstatesInfo 0CDF27911h
_NvAPI_GPU_GetPstates20 6FF81213h
_NvAPI_GPU_SetPstates20 0F4DAE6Bh
_NvAPI_GPU_GetCurrentPstate 927DA4F6h
_NvAPI_GPU_GetPstateClientLimits 88C82104h
_NvAPI_GPU_SetPstateClientLimits 0FDFC7D49h
_NvAPI_GPU_EnableOverclockedPstates 0B23B70EEh
_NvAPI_GPU_EnableDynamicPstates 0FA579A0Fh
_NvAPI_GPU_GetDynamicPstatesInfoEx 60DED2EDh
_NvAPI_GPU_GetVoltages 7D656244h
_NvAPI_GPU_GetThermalSettings 0E3640A56h
_NvAPI_GPU_SetDitherControl 0DF0DFCDDh
_NvAPI_GPU_GetDitherControl 932AC8FBh
_NvAPI_GPU_GetColorSpaceConversion 8159E87Ah
_NvAPI_GPU_SetColorSpaceConversion 0FCABD23Ah
_NvAPI_GetTVOutputInfo 30C805D5h
_NvAPI_GetTVEncoderControls 5757474Ah
_NvAPI_SetTVEncoderControls 0CA36A3ABh
_NvAPI_GetTVOutputBorderColor 6DFD1C8Ch
_NvAPI_SetTVOutputBorderColor 0AED02700h
_NvAPI_GetDisplayPosition 6BB1EE5Dh
_NvAPI_SetDisplayPosition 57D9060Fh
_NvAPI_GetValidGpuTopologies 5DFAB48Ah
_NvAPI_GetInvalidGpuTopologies 15658BE6h
_NvAPI_SetGpuTopologies 25201F3Dh
_NvAPI_GPU_GetPerGpuTopologyStatus 0A81F8992h
_NvAPI_SYS_GetChipSetTopologyStatus 8A50F126h
_NvAPI_GPU_Get_DisplayPort_DongleInfo 76A70E8Dh
_NvAPI_I2CRead 2FDE12C5h
_NvAPI_I2CWrite 0E812EB07h
_NvAPI_I2CWriteEx 283AC65Ah
_NvAPI_I2CReadEx 4D7B0709h
_NvAPI_GPU_GetPowerMizerInfo 76BFA16Bh
_NvAPI_GPU_SetPowerMizerInfo 50016C78h
_NvAPI_GPU_GetVoltageDomainsStatus 0C16C7E2Ch
_NvAPI_GPU_ClientPowerTopologyGetInfo 0A4DFD3F2h
_NvAPI_GPU_ClientPowerTopologyGetStatus 0EDCF624Eh
_NvAPI_GPU_ClientPowerPoliciesGetInfo 34206D86h
_NvAPI_GPU_ClientPowerPoliciesGetStatus 70916171h
_NvAPI_GPU_ClientPowerPoliciesSetStatus 0AD95F5EDh
_NvAPI_GPU_WorkstationFeatureSetup 6C1F3FE4h
_NvAPI_SYS_GetChipSetInfo 53DABBCAh
_NvAPI_SYS_GetLidAndDockInfo 0CDA14D8Ah
_NvAPI_OGL_ExpertModeSet 3805EF7Ah
_NvAPI_OGL_ExpertModeGet 22ED9516h
_NvAPI_OGL_ExpertModeDefaultsSet 0B47A657Eh
_NvAPI_OGL_ExpertModeDefaultsGet 0AE921F12h
_NvAPI_SetDisplaySettings 0E04F3D86h
_NvAPI_GetDisplaySettings 0DC27D5D4h
_NvAPI_GetTiming 0AFC4833Eh
_NvAPI_DISP_GetMonitorCapabilities 3B05C7E1h
_NvAPI_EnumCustomDisplay 42892957h
_NvAPI_TryCustomDisplay 0BF6C1762h
_NvAPI_RevertCustomDisplayTrial 854BA405h
_NvAPI_DeleteCustomDisplay 0E7CB998Dh
_NvAPI_SaveCustomDisplay 0A9062C78h
_NvAPI_QueryUnderscanCap 61D7B624h
_NvAPI_EnumUnderscanConfig 4144111Ah
_NvAPI_DeleteUnderscanConfig 0F98854C8h
_NvAPI_SetUnderscanConfig 3EFADA1Dh
_NvAPI_GetDisplayFeatureConfig 8E985CCDh
_NvAPI_SetDisplayFeatureConfig 0F36A668Dh
_NvAPI_GetDisplayFeatureConfigDefaults 0F5F4D01h
_NvAPI_SetView 957D7B6h
_NvAPI_GetView 0D6B99D89h
_NvAPI_SetViewEx 6B89E68h
_NvAPI_GetViewEx 0DBBC0AF4h
_NvAPI_GetSupportedViews 66FB7FC0h
_NvAPI_GetHDCPLinkParameters 0B3BB0772h
_NvAPI_Disp_DpAuxChannelControl 8EB56969h
_NvAPI_SetHybridMode 0FB22D656h
_NvAPI_GetHybridMode 0E23B68C1h
_NvAPI_Coproc_GetCoprocStatus 1EFC3957h
_NvAPI_Coproc_SetCoprocInfoFlagsEx 0F4C863ACh
_NvAPI_Coproc_GetCoprocInfoFlagsEx 69A9874Dh
_NvAPI_Coproc_NotifyCoprocPowerState 0CADCB956h
_NvAPI_Coproc_GetApplicationCoprocInfo 79232685h
_NvAPI_GetVideoState 1C5659CDh
_NvAPI_SetVideoState 54FE75Ah
_NvAPI_SetFrameRateNotify 18919887h
_NvAPI_SetPVExtName 4FEEB498h
_NvAPI_GetPVExtName 2F5B08E0h
_NvAPI_SetPVExtProfile 8354A8F4h
_NvAPI_GetPVExtProfile 1B1B9A16h
_NvAPI_VideoSetStereoInfo 97063269h
_NvAPI_VideoGetStereoInfo 8E1F8CFEh
_NvAPI_Mosaic_GetSupportedTopoInfo 0FDB63C81h
_NvAPI_Mosaic_GetTopoGroup 0CB89381Dh
_NvAPI_Mosaic_GetOverlapLimits 989685F0h
_NvAPI_Mosaic_SetCurrentTopo 9B542831h
_NvAPI_Mosaic_GetCurrentTopo 0EC32944Eh
_NvAPI_Mosaic_EnableCurrentTopo 5F1AA66Ch
_NvAPI_Mosaic_SetGridTopology 3F113C77h
_NvAPI_Mosaic_GetMosaicCapabilities 0DA97071Eh
_NvAPI_Mosaic_GetDisplayCapabilities 0D58026B9h
_NvAPI_Mosaic_EnumGridTopologies 0A3C55220h
_NvAPI_Mosaic_GetDisplayViewportsByResolution 0DC6DC8D3h
_NvAPI_Mosaic_GetMosaicViewports 7EBA036h
_NvAPI_Mosaic_SetDisplayGrids 4D959A89h
_NvAPI_Mosaic_ValidateDisplayGridsWithSLI 1ECFD263h
_NvAPI_Mosaic_ValidateDisplayGrids 0CF43903Dh
_NvAPI_Mosaic_EnumDisplayModes 78DB97D7h
_NvAPI_Mosaic_ChooseGpuTopologies 0B033B140h
_NvAPI_Mosaic_EnumDisplayGrids 0DF2887AFh
_NvAPI_GetSupportedMosaicTopologies 410B5C25h
_NvAPI_GetCurrentMosaicTopology 0F60852BDh
_NvAPI_SetCurrentMosaicTopology 0D54B8989h
_NvAPI_EnableCurrentMosaicTopology 74073CC9h
_NvAPI_QueryNonMigratableApps 0BB9EF1C3h
_NvAPI_GPU_QueryActiveApps 65B1C5F5h
_NvAPI_Hybrid_QueryUnblockedNonMigratableApps 5F35BCB5h
_NvAPI_Hybrid_QueryBlockedMigratableApps 0F4C2F8CCh
_NvAPI_Hybrid_SetAppMigrationState 0FA0B9A59h
_NvAPI_Hybrid_IsAppMigrationStateChangeable 584CB0B6h
_NvAPI_GPU_GPIOQueryLegalPins 0FAB69565h
_NvAPI_GPU_GPIOReadFromPin 0F5E10439h
_NvAPI_GPU_GPIOWriteToPin 0F3B11E68h
_NvAPI_GPU_GetHDCPSupportStatus 0F089EEF5h
_NvAPI_SetTopologyFocusDisplayAndView 0A8064F9h
_NvAPI_Stereo_CreateConfigurationProfileRegistryKey 0BE7692ECh
_NvAPI_Stereo_DeleteConfigurationProfileRegistryKey 0F117B834h
_NvAPI_Stereo_SetConfigurationProfileValue 24409F48h
_NvAPI_Stereo_DeleteConfigurationProfileValue 49BCEECFh
_NvAPI_Stereo_Enable 239C4545h
_NvAPI_Stereo_Disable 2EC50C2Bh
_NvAPI_Stereo_IsEnabled 348FF8E1h
_NvAPI_Stereo_GetStereoCaps 0DFC063B7h
_NvAPI_Stereo_GetStereoSupport 296C434Dh
_NvAPI_Stereo_CreateHandleFromIUnknown 0AC7E37F4h
_NvAPI_Stereo_DestroyHandle 3A153134h
_NvAPI_Stereo_Activate 0F6A1AD68h
_NvAPI_Stereo_Deactivate 2D68DE96h
_NvAPI_Stereo_IsActivated 1FB0BC30h
_NvAPI_Stereo_GetSeparation 451F2134h
_NvAPI_Stereo_SetSeparation 5C069FA3h
_NvAPI_Stereo_DecreaseSeparation 0DA044458h
_NvAPI_Stereo_IncreaseSeparation 0C9A8ECECh
_NvAPI_Stereo_GetConvergence 4AB00934h
_NvAPI_Stereo_SetConvergence 3DD6B54Bh
_NvAPI_Stereo_DecreaseConvergence 4C87E317h
_NvAPI_Stereo_IncreaseConvergence 0A17DAABEh
_NvAPI_Stereo_GetFrustumAdjustMode 0E6839B43h
_NvAPI_Stereo_SetFrustumAdjustMode 7BE27FA2h
_NvAPI_Stereo_CaptureJpegImage 932CB140h
_NvAPI_Stereo_CapturePngImage 8B7E99B5h
_NvAPI_Stereo_ReverseStereoBlitControl 3CD58F89h
_NvAPI_Stereo_SetNotificationMessage 6B9B409Eh
_NvAPI_Stereo_SetActiveEye 96EEA9F8h
_NvAPI_Stereo_SetDriverMode 5E8F0BECh
_NvAPI_Stereo_GetEyeSeparation 0CE653127h
_NvAPI_Stereo_IsWindowedModeSupported 40C8ED5Eh
_NvAPI_Stereo_AppHandShake 8C610BDAh
_NvAPI_Stereo_HandShake_Trigger_Activation 0B30CD1A7h
_NvAPI_Stereo_HandShake_Message_Control 315E0EF0h
_NvAPI_Stereo_SetSurfaceCreationMode 0F5DCFCBAh
_NvAPI_Stereo_GetSurfaceCreationMode 36F1C736h
_NvAPI_Stereo_Debug_WasLastDrawStereoized 0ED4416C5h
_NvAPI_Stereo_ForceToScreenDepth 2D495758h
_NvAPI_Stereo_SetVertexShaderConstantF 416C07B3h
_NvAPI_Stereo_SetVertexShaderConstantB 5268716Fh
_NvAPI_Stereo_SetVertexShaderConstantI 7923BA0Eh
_NvAPI_Stereo_GetVertexShaderConstantF 622FDC87h
_NvAPI_Stereo_GetVertexShaderConstantB 712BAA5Bh
_NvAPI_Stereo_GetVertexShaderConstantI 5A60613Ah
_NvAPI_Stereo_SetPixelShaderConstantF 0A9657F32h
_NvAPI_Stereo_SetPixelShaderConstantB 0BA6109EEh
_NvAPI_Stereo_SetPixelShaderConstantI 912AC28Fh
_NvAPI_Stereo_GetPixelShaderConstantF 0D4974572h
_NvAPI_Stereo_GetPixelShaderConstantB 0C79333AEh
_NvAPI_Stereo_GetPixelShaderConstantI 0ECD8F8CFh
_NvAPI_Stereo_SetDefaultProfile 44F0ECD1h
_NvAPI_Stereo_GetDefaultProfile 624E21C2h
_NvAPI_Stereo_Is3DCursorSupported 0D7C9EC09h
_NvAPI_Stereo_GetCursorSeparation 72162B35h
_NvAPI_Stereo_SetCursorSeparation 0FBC08FC1h
_NvAPI_VIO_GetCapabilities 1DC91303h
_NvAPI_VIO_Open 44EE4841h
_NvAPI_VIO_Close 0D01BD237h
_NvAPI_VIO_Status 0E6CE4F1h
_NvAPI_VIO_SyncFormatDetect 118D48A3h
_NvAPI_VIO_GetConfig 0D34A789Bh
_NvAPI_VIO_SetConfig 0E4EEC07h
_NvAPI_VIO_SetCSC 0A1EC8D74h
_NvAPI_VIO_GetCSC 7B0D72A3h
_NvAPI_VIO_SetGamma 964BF452h
_NvAPI_VIO_GetGamma 51D53D06h
_NvAPI_VIO_SetSyncDelay 2697A8D1h
_NvAPI_VIO_GetSyncDelay 462214A9h
_NvAPI_VIO_GetPCIInfo 0B981D935h
_NvAPI_VIO_IsRunning 96BD040Eh
_NvAPI_VIO_Start 0CDE8E1A3h
_NvAPI_VIO_Stop 6BA2A5D6h
_NvAPI_VIO_IsFrameLockModeCompatible 7BF0A94Dh
_NvAPI_VIO_EnumDevices 0FD7C5557h
_NvAPI_VIO_QueryTopology 869534E2h
_NvAPI_VIO_EnumSignalFormats 0EAD72FE4h
_NvAPI_VIO_EnumDataFormats 221FA8E8h
_NvAPI_GPU_GetTachReading 5F608315h
_NvAPI_3D_GetProperty 8061A4B1h
_NvAPI_3D_SetProperty 0C9175E8Dh
_NvAPI_3D_GetPropertyRange 0B85DE27Ch
_NvAPI_GPS_GetPowerSteeringStatus 540EE82Eh
_NvAPI_GPS_SetPowerSteeringStatus 9723D3A2h
_NvAPI_GPS_SetVPStateCap 68888EB4h
_NvAPI_GPS_GetVPStateCap 71913023h
_NvAPI_GPS_GetThermalLimit 583113EDh
_NvAPI_GPS_SetThermalLimit 0C07E210Fh
_NvAPI_GPS_GetPerfSensors 271C1109h
_NvAPI_SYS_GetDisplayIdFromGpuAndOutputId 8F2BAB4h
_NvAPI_SYS_GetGpuAndOutputIdFromDisplayId 112BA1A5h
_NvAPI_DISP_GetDisplayIdByDisplayName 0AE457190h
_NvAPI_DISP_GetGDIPrimaryDisplayId 1E9D8A31h
_NvAPI_DISP_GetDisplayConfig 11ABCCF8h
_NvAPI_DISP_SetDisplayConfig 5D8CF8DEh
_NvAPI_GPU_GetPixelClockRange 66AF10B7h
_NvAPI_GPU_SetPixelClockRange 5AC7F8E5h
_NvAPI_GPU_GetECCStatusInfo 0CA1DDAF3h
_NvAPI_GPU_GetECCErrorInfo 0C71F85A6h
_NvAPI_GPU_ResetECCErrorInfo 0C02EEC20h
_NvAPI_GPU_GetECCConfigurationInfo 77A796F3h
_NvAPI_GPU_SetECCConfiguration 1CF639D9h
_NvAPI_D3D1x_CreateSwapChain 1BC21B66h
_NvAPI_D3D9_CreateSwapChain 1A131E09h
_NvAPI_D3D_SetFPSIndicatorState 0A776E8DBh
_NvAPI_D3D9_Present 5650BEBh
_NvAPI_D3D9_QueryFrameCount 9083E53Ah
_NvAPI_D3D9_ResetFrameCount 0FA6A0675h
_NvAPI_D3D9_QueryMaxSwapGroup 5995410Dh
_NvAPI_D3D9_QuerySwapGroup 0EBA4D232h
_NvAPI_D3D9_JoinSwapGroup 7D44BB54h
_NvAPI_D3D9_BindSwapBarrier 9C39C246h
_NvAPI_D3D1x_Present 3B845A1h
_NvAPI_D3D1x_QueryFrameCount 9152E055h
_NvAPI_D3D1x_ResetFrameCount 0FBBB031Ah
_NvAPI_D3D1x_QueryMaxSwapGroup 9BB9D68Fh
_NvAPI_D3D1x_QuerySwapGroup 407F67AAh
_NvAPI_D3D1x_JoinSwapGroup 14610CD7h
_NvAPI_D3D1x_BindSwapBarrier 9DE8C729h
_NvAPI_SYS_VenturaGetState 0CB7C208Dh
_NvAPI_SYS_VenturaSetState 0CE2E9D9h
_NvAPI_SYS_VenturaGetCoolingBudget 0C9D86E33h
_NvAPI_SYS_VenturaSetCoolingBudget 85FF5A15h
_NvAPI_SYS_VenturaGetPowerReading 63685979h
_NvAPI_DISP_GetDisplayBlankingState 63E5D8DBh
_NvAPI_DISP_SetDisplayBlankingState 1E17E29Bh
_NvAPI_DRS_CreateSession 694D52Eh
_NvAPI_DRS_DestroySession 0DAD9CFF8h
_NvAPI_DRS_LoadSettings 375DBD6Bh
_NvAPI_DRS_SaveSettings 0FCBC7E14h
_NvAPI_DRS_LoadSettingsFromFile 0D3EDE889h
_NvAPI_DRS_SaveSettingsToFile 2BE25DF8h
_NvAPI_DRS_CreateProfile 0CC176068h
_NvAPI_DRS_DeleteProfile 17093206h
_NvAPI_DRS_SetCurrentGlobalProfile 1C89C5DFh
_NvAPI_DRS_GetCurrentGlobalProfile 617BFF9Fh
_NvAPI_DRS_GetProfileInfo 61CD6FD6h
_NvAPI_DRS_SetProfileInfo 16ABD3A9h
_NvAPI_DRS_FindProfileByName 7E4A9A0Bh
_NvAPI_DRS_EnumProfiles 0BC371EE0h
_NvAPI_DRS_GetNumProfiles 1DAE4FBCh
_NvAPI_DRS_CreateApplication 4347A9DEh
_NvAPI_DRS_DeleteApplicationEx 0C5EA85A1h
_NvAPI_DRS_DeleteApplication 2C694BC6h
_NvAPI_DRS_GetApplicationInfo 0ED1F8C69h
_NvAPI_DRS_EnumApplications 7FA2173Ah
_NvAPI_DRS_FindApplicationByName 0EEE566B2h
_NvAPI_DRS_SetSetting 577DD202h
_NvAPI_DRS_GetSetting 73BF8338h
_NvAPI_DRS_EnumSettings 0AE3039DAh
_NvAPI_DRS_EnumAvailableSettingIds 0F020614Ah
_NvAPI_DRS_EnumAvailableSettingValues 2EC39F90h
_NvAPI_DRS_GetSettingIdFromName 0CB7309CDh
_NvAPI_DRS_GetSettingNameFromId 0D61CBE6Eh
_NvAPI_DRS_DeleteProfileSetting 0E4A26362h
_NvAPI_DRS_RestoreAllDefaults 5927B094h
_NvAPI_DRS_RestoreProfileDefault 0FA5F6134h
_NvAPI_DRS_RestoreProfileDefaultSetting 53F0381Eh
_NvAPI_DRS_GetBaseProfile 0DA8466A0h
_NvAPI_Event_RegisterCallback 0E6DBEA69h
_NvAPI_Event_UnregisterCallback 0DE1F9B45h
_NvAPI_GPU_GetCurrentThermalLevel 0D2488B79h
_NvAPI_GPU_GetCurrentFanSpeedLevel 0BD71F0C9h
_NvAPI_GPU_SetScanoutIntensity 0A57457A4h
_NvAPI_GPU_SetScanoutWarping 0B34BAB4Fh
_NvAPI_GPU_GetScanoutConfiguration 6A9F5B63h
_NvAPI_DISP_SetHCloneTopology 61041C24h
_NvAPI_DISP_GetHCloneTopology 47BAD137h
_NvAPI_DISP_ValidateHCloneTopology 5F4C2664h
_NvAPI_GPU_GetPerfDecreaseInfo 7F7F4600h
_NvAPI_GPU_QueryIlluminationSupport 0A629DA31h
_NvAPI_GPU_GetIllumination 9A1B9365h
_NvAPI_GPU_SetIllumination 254A187h
_NvAPI_D3D1x_IFR_SetUpTargetBufferToSys 473F7828h
_NvAPI_D3D1x_IFR_TransferRenderTarget 9FBAE4EBh
“_NvAPI_GPU_SetPstates20 0F4DAE6Bh”: it checks out, we’re definitely heading the correct way!
In IDApro we can also check the Xrefs from the NvQueryInterface and we land at the start of the data section with a big array of INTs grouped by pairs, each comprises the address of a function and the associated Nvidia function ID:
Then again, IDs and addresses are valid according to the information we already have.
It means we are now sure of the location of “GetPstates20” and “SetPstates20”. We can break directly inside them at will. Let’s do that in IDA after importing the nvapi.h headers so IDA knows about the structs in use: grab the pointer for the second argument on the stack just when entering “GetPstates20”, dereference it and apply the type of an “NV_GPU_PERF_PSTATES20_INFO_V1” struct to it.
Now all those apparently garbage values are starting to make sense.
We can confirm everything is correct as we expected by comparing one of the values to an authoritative measurement. Here 0xD8ACC stands for the GPU vCore represented as µVolts. It is 887500 in base 10, meaning 887.5mV or 0.8875V. The GPU-Z tool reports a similar value.
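The arithmetic above is easy to double-check with a few lines of C. This is just a sanity-check sketch; `uv_to_mv` and `format_volts` are helper names invented here, nothing from NvAPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Whole millivolts in a microvolt reading (fraction dropped). */
uint32_t uv_to_mv(uint32_t uv) { return uv / 1000u; }

/* Write the reading as volts with four decimals, e.g. "0.8875 V".
 * Returns what snprintf returns. Integer math only, so no rounding noise. */
int format_volts(uint32_t uv, char *buf, size_t len)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%u.%04u V",
                    (unsigned)(uv / 1000000u),
                    (unsigned)((uv % 1000000u) / 100u));
}
```

Feeding it `0xD8ACC` gives 887 whole millivolts and the string `0.8875 V`, which matches the figures quoted above.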
It seems we’re doing fine:
For good measure let’s go back a bit and put a conditional logging breakpoint in Olly at the beginning of the “QueryInterface” function in order to log ALL the function IDs successively requested by MSI Afterburner. Just in case things don’t go smoothly and we encounter a difficult pipeline setup before being able to overclock the GPU.
_NvAPI_Initialize = 0150E828
718006A0 COND: offset = 33C7358C
718006A0 COND: offset = 593E8644
_NvAPI_SYS_GetDriverAndBranchVersion = 2926AAAD
_NvAPI_EnumPhysicalGPUs = E5AC921F
718006A0 COND: offset = 6533EA3E
_NvAPI_GPU_GetBusId = 1BE0B8E5
_NvAPI_GPU_GetBusSlotId = 2A0A350F
_NvAPI_GPU_GetPCIIdentifiers = 2DDFB66E
_NvAPI_DRS_EnumAvailableSettingIds = F020614A
_NvAPI_DRS_CreateSession = 0694D52E
_NvAPI_DRS_LoadSettings = 375DBD6B
_NvAPI_DRS_GetBaseProfile = DA8466A0
_NvAPI_DRS_GetSetting = 73BF8338
_NvAPI_DRS_DestroySession = DAD9CFF8
_NvAPI_GPU_ClientPowerTopologyGetStatus = EDCF624E
_NvAPI_GPU_GetThermalSettings = E3640A56
_NvAPI_GPU_GetDynamicPstatesInfoEx = 60DED2ED
_NvAPI_GPU_GetCoolerSettings = DA141340
_NvAPI_GPU_GetAllClockFrequencies = DCB616C3
_NvAPI_GPU_GetPstates20 = 6FF81213
718006A0 COND: offset = 07F9B368
718006A0 COND: offset = 409D9841
_NvAPI_GPU_ClientPowerPoliciesGetInfo = 34206D86
718006A0 COND: offset = 0D258BB5
_NvAPI_GPU_GetSystemType = BAAABFCC
_NvAPI_GPU_GetFullName = CEEE8E9F
718006A0 COND: offset = 3D358A0C
718006A0 COND: offset = D988F0F3
_NvAPI_GPU_GetVbiosVersionString = A561FD7D
_NvAPI_GPU_SetPstates20 = 0F4DAE6B
_NvAPI_Unload = D22BDD7E
Despite the stackoverflow post being somewhat outdated or incomplete we can still name the majority of the functions called.
Reading through that quickly shows the obvious things one would expect: init the NvAPI, get various information, finally call SetPstates20 (when we hit a breakpoint attempting some small underclock) and clean up the API.
Considering “GetVbiosVersionString” is called just before the overclocking related function and it’s a purely GUI/dashboard info feature we can safely assume that no particular setup is required, we just need the correct arguments to call said function.
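Matching the logged IDs against the name list by hand gets old fast, so here is a rough sketch of how that annotation could be automated. The table holds only a handful of entries copied from the list above (the full list would go there), and `lookup_name` and `describe_id` are helper names invented for this example. IDs missing from the table are simply reported as unknown, like the unnamed entries in the log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct id_name { uint32_t id; const char *name; };

/* A few (ID, name) pairs taken from the list above. */
static const struct id_name known_ids[] = {
    { 0x0150E828u, "NvAPI_Initialize" },
    { 0xE5AC921Fu, "NvAPI_EnumPhysicalGPUs" },
    { 0x6FF81213u, "NvAPI_GPU_GetPstates20" },
    { 0x0F4DAE6Bu, "NvAPI_GPU_SetPstates20" },
    { 0xD22BDD7Eu, "NvAPI_Unload" },
};

/* Return the name for an ID, or NULL when the ID is not in the table. */
const char *lookup_name(uint32_t id)
{
    for (size_t i = 0; i < sizeof known_ids / sizeof known_ids[0]; i++)
        if (known_ids[i].id == id)
            return known_ids[i].name;
    return NULL;
}

/* Format one log line: the name when known, a placeholder otherwise. */
int describe_id(uint32_t id, char *buf, size_t len)
{
    const char *name = lookup_name(id);

    assert(buf != NULL);
    if (name != NULL)
        return snprintf(buf, len, "_%s = %08X", name, (unsigned)id);
    return snprintf(buf, len, "unknown = %08X", (unsigned)id);
}
```

A linear scan is plenty for a few hundred entries; nothing here needs to be fast.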
Reversing the function’s arguments
This one was supposed to be hell but actually it went better than previously thought:
– Most NvAPI functions including GetPstates20 take a physical GPU handle as their first argument.
– GetPstates20’s second argument is a struct for storing Pstates and it is documented in the public NvAPI headers.
– A quick look at the code calling “SetPstates20” in MSI afterburner shows 2 pushed arguments before the call. The first one is “0x100” for both the GET and SET functions, it is the handle for our GPU#0.
It is then highly likely that “SetPstates20” will take the same kind of struct as “GetPstates20” for its second argument, with a few edited values.
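If Set really does mirror Get, the “few edited values” step in between could be as small as the sketch below. Take it with a grain of salt: `struct clock_entry` is a cut-down stand-in invented for this example, not the NvAPI layout, and `set_clock_delta` is a hypothetical helper. Clamping the requested offset to the advertised range is a precaution chosen here, not something observed in MSI Afterburner.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down, hypothetical stand-in for one clock entry; NOT the NvAPI layout. */
struct clock_entry {
    uint32_t editable;        /* non-zero when changes are allowed   */
    int32_t  delta_khz;       /* current offset                      */
    int32_t  min_delta_khz;   /* lowest offset the entry advertises  */
    int32_t  max_delta_khz;   /* highest offset the entry advertises */
};

/* Store a requested offset, clamped to the advertised range.
 * Returns 0 on success, -1 if the entry is not editable (left untouched). */
int set_clock_delta(struct clock_entry *e, int32_t wanted_khz)
{
    assert(e != NULL);
    if (!e->editable)
        return -1;
    if (wanted_khz < e->min_delta_khz)
        wanted_khz = e->min_delta_khz;
    if (wanted_khz > e->max_delta_khz)
        wanted_khz = e->max_delta_khz;
    e->delta_khz = wanted_khz;
    return 0;
}
```

The idea would be: read the struct with the Get call, edit one delta like this, hand the result to the Set call.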
Let’s get coding
First, let’s isolate the data structures we need from the NvAPI headers because there are far too many lines in that file and I’m lazy to the point that scrolling hurts my finger. Also those are mostly ints so we’ll get rid of all the fancy names and make them regular uint/int for readability.
typedef unsigned long NvU32;   /* 32 bits wide on Windows */

typedef struct {
    NvU32 version;
    NvU32 ClockType:2;
    NvU32 reserved:22;
    NvU32 reserved1:8;
    struct {
        NvU32 bIsPresent:1;
        NvU32 reserved:31;
        NvU32 frequency;
    } domain[32];
} NV_GPU_CLOCK_FREQUENCIES_V2;

typedef struct {
    int value;
    struct {
        int mindelta;
        int maxdelta;
    } valueRange;
} NV_GPU_PERF_PSTATES20_PARAM_DELTA;

typedef struct {
    NvU32 domainId;
    NvU32 typeId;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NV_GPU_PERF_PSTATES20_PARAM_DELTA freqDelta_kHz;
    union {
        struct {
            NvU32 freq_kHz;
        } single;
        struct {
            NvU32 minFreq_kHz;
            NvU32 maxFreq_kHz;
            NvU32 domainId;
            NvU32 minVoltage_uV;
            NvU32 maxVoltage_uV;
        } range;
    } data;
} NV_GPU_PSTATE20_CLOCK_ENTRY_V1;

typedef struct {
    NvU32 domainId;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NvU32 volt_uV;
    /* A full delta struct (value plus range), not a bare int: with a bare int
       the root struct would only be 0x1a94 bytes instead of 0x1c94. */
    NV_GPU_PERF_PSTATES20_PARAM_DELTA voltDelta_uV;
} NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1;

typedef struct {
    NvU32 version;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NvU32 numPstates;
    NvU32 numClocks;
    NvU32 numBaseVoltages;
    struct {
        NvU32 pstateId;
        NvU32 bIsEditable:1;
        NvU32 reserved:31;
        NV_GPU_PSTATE20_CLOCK_ENTRY_V1 clocks[8];
        NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1 baseVoltages[4];
    } pstates[16];
} NV_GPU_PERF_PSTATES20_INFO_V1;
Then we need prototypes for the functions we’ll use. Remember we won’t call the provided exports from the NvAPI lib inside the SDK but rather retrieve the function pointers directly from the running nvapi.dll and execute them as such.
Here is a handful of convenient function prototypes to get some info, retrieve the clocks and set them up. Some of them can be found inside the public API, the others are probably from the NDA version. We use the same techniques as mentioned earlier to get to know about them:
typedef void *(*NvAPI_QueryInterface_t)(unsigned int offset);
typedef int (*NvAPI_Initialize_t)();
typedef int (*NvAPI_Unload_t)();
typedef int (*NvAPI_EnumPhysicalGPUs_t)(int **handles, int *count);
typedef int (*NvAPI_GPU_GetSystemType_t)(int *handle, int *systype);
typedef int (*NvAPI_GPU_GetFullName_t)(int *handle, char *sysname);
typedef int (*NvAPI_GPU_GetPhysicalFrameBufferSize_t)(int *handle, int *memsize);
typedef int (*NvAPI_GPU_GetRamType_t)(int *handle, int *memtype);
typedef int (*NvAPI_GPU_GetVbiosVersionString_t)(int *handle, char *biosname);
/* Takes the clock frequencies struct, not the pstates one */
typedef int (*NvAPI_GPU_GetAllClockFrequencies_t)(int *handle, NV_GPU_CLOCK_FREQUENCIES_V2 *clock_info);
typedef int (*NvAPI_GPU_GetPstates20_t)(int *handle, NV_GPU_PERF_PSTATES20_INFO_V1 *pstates_info);
typedef int (*NvAPI_GPU_SetPstates20_t)(int *handle, int *pstates_info);

NvAPI_QueryInterface_t NvQueryInterface = 0;
NvAPI_Initialize_t NvInit = 0;
NvAPI_Unload_t NvUnload = 0;
NvAPI_EnumPhysicalGPUs_t NvEnumGPUs = 0;
NvAPI_GPU_GetSystemType_t NvGetSysType = 0;
NvAPI_GPU_GetFullName_t NvGetName = 0;
NvAPI_GPU_GetPhysicalFrameBufferSize_t NvGetMemSize = 0;
NvAPI_GPU_GetRamType_t NvGetMemType = 0;
NvAPI_GPU_GetVbiosVersionString_t NvGetBiosName = 0;
NvAPI_GPU_GetAllClockFrequencies_t NvGetFreq = 0;
NvAPI_GPU_GetPstates20_t NvGetPstates = 0;
NvAPI_GPU_SetPstates20_t NvSetPstates = 0;
Time for the main() function. The code should be fairly short and this is just a PoC or whatever, so brace yourself for screaming KNF nazis.
We’ll need these variables. Most of them are of lesser importance; only the last line is a requirement.
“NV_GPU_PERF_PSTATES20_INFO_V1” is the root struct holding all the clocking and power data for the selected GPU handle. The size of this struct is 0x1c94 bytes. For some reason Nvidia decided to use that size as the “version” field after adding 0x10000 to it, so we set the field to 0x11c94; otherwise the subsequent calls using the structure return a cryptic error code.
int nGPU=0, userfreq = 0, systype=0, memsize=0, memtype=0;
int *hdlGPU[64]={0}, *buf=0;
char sysname[64]={0}, biosname[64]={0};
NV_GPU_PERF_PSTATES20_INFO_V1 pstates_info;
pstates_info.version = 0x11c94;
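The size-plus-0x10000 arithmetic behind the magic number can be written down as a few tiny helpers. This is only a sketch: the helper names are made up for this post, and reading the upper bits as a "revision" is my interpretation (as far as I can tell the public SDK header builds its version constants the same way with its MAKE_NVAPI_VERSION macro).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the low 16 bits carry the struct size in bytes,
   the bits above carry the struct revision (1 for the _V1 struct). */
static uint32_t nv_make_version(size_t struct_size, uint32_t revision)
{
    return (uint32_t)struct_size | (revision << 16);
}

/* Inverse helpers, handy when decoding a version field seen in a debugger. */
static uint32_t nv_version_size(uint32_t version)
{
    return version & 0xFFFFu;
}

static uint32_t nv_version_revision(uint32_t version)
{
    return version >> 16;
}
```

With the numbers from this post, nv_make_version(0x1c94, 1) gives 0x11c94, and the two inverse helpers split it back into 0x1c94 and 1.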
Now we actually load “nvapi.dll” in our program’s memory space and retrieve the “nvapi_QueryInterface” export that will provide us with the pointers for all the other functions. We then call it successively with all the IDs we need and assign the results to our function pointers.
NvQueryInterface = (void*)GetProcAddress(LoadLibrary("nvapi.dll"), "nvapi_QueryInterface");
NvInit           = NvQueryInterface(0x0150E828);
NvUnload         = NvQueryInterface(0xD22BDD7E);
NvEnumGPUs       = NvQueryInterface(0xE5AC921F);
NvGetSysType     = NvQueryInterface(0xBAAABFCC);
NvGetName        = NvQueryInterface(0xCEEE8E9F);
NvGetMemSize     = NvQueryInterface(0x46FBEB03);
NvGetMemType     = NvQueryInterface(0x57F7CAAC);
NvGetBiosName    = NvQueryInterface(0xA561FD7D);
NvGetFreq        = NvQueryInterface(0xDCB616C3);
NvGetPstates     = NvQueryInterface(0x6FF81213);
NvSetPstates     = NvQueryInterface(0x0F4DAE6B);
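The PoC never checks whether the lookups succeeded. A slightly more careful take on the same idea could go through a table and stop at the first ID the driver does not know. Everything below (nv_import, resolve_imports, fake_query) is a hypothetical name for this sketch, and it assumes the query function returns NULL for an unknown ID; fake_query only stands in for the real export so the helper can be tried on a machine without the driver.

```c
#include <stddef.h>

/* Same shape as the nvapi_QueryInterface export: ID in, pointer out. */
typedef void *(*query_fn)(unsigned int id);

/* One entry per wanted function: the ID to ask for and where to store the answer. */
typedef struct {
    unsigned int id;
    void       **slot;
    const char  *name;   /* only useful for error messages */
} nv_import;

/* Resolves entries in order and returns how many were resolved.
   The result equals count only if every lookup succeeded. */
static size_t resolve_imports(query_fn query, const nv_import *table, size_t count)
{
    size_t i;

    if (query == NULL)
        return 0;                      /* DLL or export was not found */
    for (i = 0; i < count; i++) {
        *table[i].slot = query(table[i].id);
        if (*table[i].slot == NULL)
            break;                     /* unknown ID: stop here */
    }
    return i;
}

/* Stand-in for the driver: it only "knows" the Initialize and Unload IDs. */
static void *fake_query(unsigned int id)
{
    static int dummy_init, dummy_unload;

    if (id == 0x0150E828u) return &dummy_init;
    if (id == 0xD22BDD7Eu) return &dummy_unload;
    return NULL;
}
```

In the real program the first argument would be the pointer returned by GetProcAddress, and a return value smaller than the table size would mean "bail out before calling anything".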
We have all the required bits for our big plot so let’s just assemble the bricks together to get the information and data we need and display them in an ugly fashion.
NvInit();
NvEnumGPUs(hdlGPU, &nGPU);
NvGetSysType(hdlGPU[0], &systype);
NvGetName(hdlGPU[0], sysname);
NvGetMemSize(hdlGPU[0], &memsize);
NvGetMemType(hdlGPU[0], &memtype);
NvGetBiosName(hdlGPU[0], biosname);
NvGetPstates(hdlGPU[0], &pstates_info);

switch(systype){
    case 1:  printf("\nType: Laptop\n"); break;
    case 2:  printf("\nType: Desktop\n"); break;
    default: printf("\nType: Unknown\n"); break;
}
printf("Name: %s\n", sysname);
printf("VRAM: %dMB GDDR%d\n", memsize/1024, memtype<=7?3:5);
printf("BIOS: %s\n", biosname);
printf("\nGPU: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).data.range.maxFreq_kHz)/1000);
printf("RAM: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).data.single.freq_kHz)/1000);
printf("\nCurrent GPU OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).freqDelta_kHz.value)/1000);
printf("Current RAM OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).freqDelta_kHz.value)/1000);
It should already be enough for a simple monitoring program like the well known GPUz but we can do better than that and get to the delicious megahertz… /omnomnomz.
Here’s for the GPU overclocking:
if(argc > 1){
    userfreq = atoi(argv[1])*1000;
    if(-250000 <= userfreq && userfreq <= 250000) {
        buf = malloc(0x1c94);
        memset(buf, 0, 0x1c94);
        buf[0] = 0x11c94;
        buf[2] = 1;
        buf[3] = 1;
        buf[10] = userfreq;
        NvSetPstates(hdlGPU[0], buf)? printf("\nGPU OC failed!\n") : printf("\nGPU OC OK: %d MHz\n", userfreq/1000);
        free(buf);
    } else {
        printf("\nGPU Frequency not in safe range (-250MHz to +250MHz).\n");
        return 1;
    }
}
And almost the same block of code for the VRAM overclocking:
if(argc > 2){
    userfreq = atoi(argv[2])*1000;
    if(-250000 <= userfreq && userfreq <= 250000) {
        buf = malloc(0x1c94);
        memset(buf, 0, 0x1c94);
        buf[0] = 0x11c94;
        buf[2] = 1;
        buf[3] = 1;
        buf[7] = 4;
        buf[10] = memtype<=7?userfreq:userfreq*2;
        NvSetPstates(hdlGPU[0], buf)? printf("VRAM OC failed!\n") : printf("RAM OC OK: %d MHz\n", userfreq/1000);
        free(buf);
    } else {
        printf("\nRAM Frequency not in safe range (-250MHz to +250MHz).\n");
        return 1;
    }
}
What it does is simple. Frequencies in the struct are expressed in kilohertz, so we multiply the offset provided by the user (in MHz) by 1000 and range-check the result, trying not to let an integer overflow sneak in and fry the core. ;-)
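The PoC multiplies atoi()'s result first and checks the range afterwards, and atoi() gives no way to detect garbage or overflow. A stricter variant is sketched below, checking the range before multiplying. The function name parse_offset_khz and the MAX_OFFSET_MHZ constant are mine; the ±250MHz limit is the one used in the PoC.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_OFFSET_MHZ 250   /* same "safe range" as the PoC */

/* Parses a signed MHz offset such as "+100" or "-50" and converts it to kHz.
   Returns 1 and writes *khz on success, 0 on garbage or out-of-range input. */
static int parse_offset_khz(const char *text, int *khz)
{
    char *end = NULL;
    long mhz;

    if (text == NULL || khz == NULL || *text == '\0')
        return 0;
    errno = 0;
    mhz = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0')
        return 0;                      /* overflow, or trailing junk like "100MHz" */
    if (mhz < -MAX_OFFSET_MHZ || mhz > MAX_OFFSET_MHZ)
        return 0;                      /* rejected before the multiplication */
    *khz = (int)(mhz * 1000);
    return 1;
}
```

Because the range is checked on the MHz value, the multiplication can never produce anything beyond ±250000.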
No seriously, we “allocate” a “NV_GPU_PERF_PSTATES20_INFO_V1” yet again, but it’s a bloated pain and we only want a few values changed, so we make it a zeroed buffer the same size as the struct.
We then fill the first int with the magic 0x11c94 version number, and the ints at index 2 and 3 (numPstates and numClocks in the struct) with 1, meaning we provide only one Pstate profile containing only one clock domain to the SetPstates20() function. The int at index 10 is the frequency delta of that single clock entry, in kHz.
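To double-check which struct field each raw int index lands on, offsetof can do the counting. The sketch below repeats the struct definitions with two assumptions of mine: NvU32 is a fixed-width uint32_t so the layout is the same on every compiler, and voltDelta_uV uses the same three-int delta type as freqDelta_kHz, which is what makes the total size come out to the 0x1c94 quoted earlier. The INT_INDEX macro is a made-up name.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t NvU32;   /* assumption: fixed 32 bits on every platform */

typedef struct {
    int value;
    struct { int mindelta; int maxdelta; } valueRange;
} NV_GPU_PERF_PSTATES20_PARAM_DELTA;

typedef struct {
    NvU32 domainId;
    NvU32 typeId;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NV_GPU_PERF_PSTATES20_PARAM_DELTA freqDelta_kHz;
    union {
        struct { NvU32 freq_kHz; } single;
        struct { NvU32 minFreq_kHz; NvU32 maxFreq_kHz; NvU32 domainId;
                 NvU32 minVoltage_uV; NvU32 maxVoltage_uV; } range;
    } data;
} NV_GPU_PSTATE20_CLOCK_ENTRY_V1;

typedef struct {
    NvU32 domainId;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NvU32 volt_uV;
    NV_GPU_PERF_PSTATES20_PARAM_DELTA voltDelta_uV;   /* assumption, see above */
} NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1;

typedef struct {
    NvU32 version;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NvU32 numPstates;
    NvU32 numClocks;
    NvU32 numBaseVoltages;
    struct {
        NvU32 pstateId;
        NvU32 bIsEditable:1;
        NvU32 reserved:31;
        NV_GPU_PSTATE20_CLOCK_ENTRY_V1 clocks[8];
        NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1 baseVoltages[4];
    } pstates[16];
} NV_GPU_PERF_PSTATES20_INFO_V1;

/* Index of a field when the struct is viewed as an array of 32-bit ints. */
#define INT_INDEX(field) \
    (offsetof(NV_GPU_PERF_PSTATES20_INFO_V1, field) / sizeof(NvU32))
```

Under those assumptions INT_INDEX(numPstates) is 2, INT_INDEX(numClocks) is 3 and INT_INDEX(pstates[0].clocks[0].freqDelta_kHz.value) is 10, which matches the raw indices written by the PoC.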
But if we do that… it works for the GPU but not for the VRAM, how do we overclock the damn VRAM?
At this point I was saying to myself “why can those guys get an overclock in their crappy soft and I cannot, that’s just unfair”. But this approach never gives any noteworthy result so I got back in IDA and diffed my struct with the struct that MSI Afterburner provides to the same call when overclocking the RAM.
And the trick was there before my eyes: the int at index 7 of the struct had changed from 0 to 4. Going by the struct layout, that int is the domainId of the first clock entry, so it selects the clock domain: 0 being the GPU, 2 may or may not be the separate shader clock domain of previous GPU generations, and 4 would then be the VRAM.
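Since the GPU and VRAM blocks differ only by that one int, both could share a small request builder. This is a sketch rather than what the PoC does: build_oc_request and the IDX_/DOMAIN_ names are invented here, the index values are the ones used in the PoC, and it assumes a 32-bit int as on Windows.

```c
#include <stdlib.h>

#define PSTATES20_SIZE    0x1c94    /* struct size in bytes, as stated above */
#define PSTATES20_VERSION 0x11c94   /* size + 0x10000 */

/* Positions of the interesting fields when the struct is viewed as int[]. */
enum {
    IDX_VERSION       = 0,
    IDX_NUM_PSTATES   = 2,
    IDX_NUM_CLOCKS    = 3,
    IDX_CLOCK0_DOMAIN = 7,
    IDX_CLOCK0_DELTA  = 10
};

/* Clock domain values observed in this post; the names are made up. */
enum { DOMAIN_GPU = 0, DOMAIN_VRAM = 4 };

/* Returns a zeroed request with one P-state and one clock entry filled in,
   or NULL if the allocation fails. The caller frees it. */
static int *build_oc_request(int domain, int delta_khz)
{
    int *buf = calloc(1, PSTATES20_SIZE);

    if (buf == NULL)
        return NULL;
    buf[IDX_VERSION]       = PSTATES20_VERSION;
    buf[IDX_NUM_PSTATES]   = 1;
    buf[IDX_NUM_CLOCKS]    = 1;
    buf[IDX_CLOCK0_DOMAIN] = domain;
    buf[IDX_CLOCK0_DELTA]  = delta_khz;
    return buf;
}
```

The result would be passed to NvSetPstates(hdlGPU[0], req) exactly like buf in the PoC. The doubling of the offset for GDDR5 stays the caller's job in this sketch.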
At last we unload the DLL and end our program:
NvUnload();
return 0;
What’s left to be done?
Testing our new toy obviously! We compile that thing and run it:
C:\>overclock.exe [+/- GPU MHz offset] [+/- RAM MHz offset]
Here we run two benchmarks using a basic MD5 bruteforcing so that we can be sure the modified clock speed is effective and we didn’t just change some funny numbers for display only.
First at stock frequency (950MHz) and then with a 100MHz underclock (the dev machine is a laptop and I don’t intend to make it faster, so subtracting 100 will suffice).
And… it works! We’re done here guys.
I am not aware of any other open source implementation of such a tool. It might only be a very simple C program in the end, but it exists, and the minimum details required for overclocking an Nvidia GPU programmatically are now public and in plain text.
This code, for what it’s worth, is free as in free beer: take it, polish it, make a LIGHT (not the usual stellar poop, we already have those) GUI for your own needs and enjoy.
Here’s a binary build of the program, rename it as .exe as wordpress.com will only let users upload media files:
https://1vwjbxf1wko0yhnr.files.wordpress.com/2015/08/overclock.jpg
Full code for your convenience, it should compile without warnings on pretty much everything and has no dependencies:
/*
            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
                    Version 2, December 2004

 Copyright (C) 2004 Sam Hocevar <sam@hocevar.net>

 Everyone is permitted to copy and distribute verbatim or modified
 copies of this license document, and changing it is allowed as long
 as the name is changed.

            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. You just DO WHAT THE FUCK YOU WANT TO.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>

typedef unsigned long NvU32;

typedef struct {
    NvU32 version;
    NvU32 ClockType:2;
    NvU32 reserved:22;
    NvU32 reserved1:8;
    struct {
        NvU32 bIsPresent:1;
        NvU32 reserved:31;
        NvU32 frequency;
    } domain[32];
} NV_GPU_CLOCK_FREQUENCIES_V2;

typedef struct {
    int value;
    struct {
        int mindelta;
        int maxdelta;
    } valueRange;
} NV_GPU_PERF_PSTATES20_PARAM_DELTA;

typedef struct {
    NvU32 domainId;
    NvU32 typeId;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NV_GPU_PERF_PSTATES20_PARAM_DELTA freqDelta_kHz;
    union {
        struct {
            NvU32 freq_kHz;
        } single;
        struct {
            NvU32 minFreq_kHz;
            NvU32 maxFreq_kHz;
            NvU32 domainId;
            NvU32 minVoltage_uV;
            NvU32 maxVoltage_uV;
        } range;
    } data;
} NV_GPU_PSTATE20_CLOCK_ENTRY_V1;

typedef struct {
    NvU32 domainId;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NvU32 volt_uV;
    /* three ints, like freqDelta_kHz: this is what makes
       sizeof(NV_GPU_PERF_PSTATES20_INFO_V1) come out to 0x1c94 */
    NV_GPU_PERF_PSTATES20_PARAM_DELTA voltDelta_uV;
} NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1;

typedef struct {
    NvU32 version;
    NvU32 bIsEditable:1;
    NvU32 reserved:31;
    NvU32 numPstates;
    NvU32 numClocks;
    NvU32 numBaseVoltages;
    struct {
        NvU32 pstateId;
        NvU32 bIsEditable:1;
        NvU32 reserved:31;
        NV_GPU_PSTATE20_CLOCK_ENTRY_V1 clocks[8];
        NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1 baseVoltages[4];
    } pstates[16];
} NV_GPU_PERF_PSTATES20_INFO_V1;

typedef void *(*NvAPI_QueryInterface_t)(unsigned int offset);
typedef int (*NvAPI_Initialize_t)(void);
typedef int (*NvAPI_Unload_t)(void);
typedef int (*NvAPI_EnumPhysicalGPUs_t)(int **handles, int *count);
typedef int (*NvAPI_GPU_GetSystemType_t)(int *handle, int *systype);
typedef int (*NvAPI_GPU_GetFullName_t)(int *handle, char *sysname);
typedef int (*NvAPI_GPU_GetPhysicalFrameBufferSize_t)(int *handle, int *memsize);
typedef int (*NvAPI_GPU_GetRamType_t)(int *handle, int *memtype);
typedef int (*NvAPI_GPU_GetVbiosVersionString_t)(int *handle, char *biosname);
/* unused in this PoC; takes the clock frequencies struct, not the pstates one */
typedef int (*NvAPI_GPU_GetAllClockFrequencies_t)(int *handle, NV_GPU_CLOCK_FREQUENCIES_V2 *clock_info);
typedef int (*NvAPI_GPU_GetPstates20_t)(int *handle, NV_GPU_PERF_PSTATES20_INFO_V1 *pstates_info);
typedef int (*NvAPI_GPU_SetPstates20_t)(int *handle, int *pstates_info);

NvAPI_QueryInterface_t                 NvQueryInterface = 0;
NvAPI_Initialize_t                     NvInit = 0;
NvAPI_Unload_t                         NvUnload = 0;
NvAPI_EnumPhysicalGPUs_t               NvEnumGPUs = 0;
NvAPI_GPU_GetSystemType_t              NvGetSysType = 0;
NvAPI_GPU_GetFullName_t                NvGetName = 0;
NvAPI_GPU_GetPhysicalFrameBufferSize_t NvGetMemSize = 0;
NvAPI_GPU_GetRamType_t                 NvGetMemType = 0;
NvAPI_GPU_GetVbiosVersionString_t      NvGetBiosName = 0;
NvAPI_GPU_GetAllClockFrequencies_t     NvGetFreq = 0;
NvAPI_GPU_GetPstates20_t               NvGetPstates = 0;
NvAPI_GPU_SetPstates20_t               NvSetPstates = 0;

int main(int argc, char **argv)
{
    int nGPU=0, userfreq = 0, systype=0, memsize=0, memtype=0;
    int *hdlGPU[64]={0}, *buf=0;
    char sysname[64]={0}, biosname[64]={0};
    NV_GPU_PERF_PSTATES20_INFO_V1 pstates_info;
    pstates_info.version = 0x11c94;

    NvQueryInterface = (void*)GetProcAddress(LoadLibrary("nvapi.dll"), "nvapi_QueryInterface");
    NvInit           = NvQueryInterface(0x0150E828);
    NvUnload         = NvQueryInterface(0xD22BDD7E);
    NvEnumGPUs       = NvQueryInterface(0xE5AC921F);
    NvGetSysType     = NvQueryInterface(0xBAAABFCC);
    NvGetName        = NvQueryInterface(0xCEEE8E9F);
    NvGetMemSize     = NvQueryInterface(0x46FBEB03);
    NvGetMemType     = NvQueryInterface(0x57F7CAAC);
    NvGetBiosName    = NvQueryInterface(0xA561FD7D);
    NvGetFreq        = NvQueryInterface(0xDCB616C3);
    NvGetPstates     = NvQueryInterface(0x6FF81213);
    NvSetPstates     = NvQueryInterface(0x0F4DAE6B);

    NvInit();
    NvEnumGPUs(hdlGPU, &nGPU);
    NvGetSysType(hdlGPU[0], &systype);
    NvGetName(hdlGPU[0], sysname);
    NvGetMemSize(hdlGPU[0], &memsize);
    NvGetMemType(hdlGPU[0], &memtype);
    NvGetBiosName(hdlGPU[0], biosname);
    NvGetPstates(hdlGPU[0], &pstates_info);

    switch(systype){
        case 1:  printf("\nType: Laptop\n"); break;
        case 2:  printf("\nType: Desktop\n"); break;
        default: printf("\nType: Unknown\n"); break;
    }
    printf("Name: %s\n", sysname);
    printf("VRAM: %dMB GDDR%d\n", memsize/1024, memtype<=7?3:5);
    printf("BIOS: %s\n", biosname);
    printf("\nGPU: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).data.range.maxFreq_kHz)/1000);
    printf("RAM: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).data.single.freq_kHz)/1000);
    printf("\nCurrent GPU OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).freqDelta_kHz.value)/1000);
    printf("Current RAM OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).freqDelta_kHz.value)/1000);

    if(argc > 1){
        userfreq = atoi(argv[1])*1000;
        if(-250000 <= userfreq && userfreq <= 250000) {
            buf = malloc(0x1c94);
            memset(buf, 0, 0x1c94);
            buf[0] = 0x11c94;
            buf[2] = 1;
            buf[3] = 1;
            buf[10] = userfreq;
            NvSetPstates(hdlGPU[0], buf)? printf("\nGPU OC failed!\n") : printf("\nGPU OC OK: %d MHz\n", userfreq/1000);
            free(buf);
        } else {
            printf("\nGPU Frequency not in safe range (-250MHz to +250MHz).\n");
            return 1;
        }
    }

    if(argc > 2){
        userfreq = atoi(argv[2])*1000;
        if(-250000 <= userfreq && userfreq <= 250000) {
            buf = malloc(0x1c94);
            memset(buf, 0, 0x1c94);
            buf[0] = 0x11c94;
            buf[2] = 1;
            buf[3] = 1;
            buf[7] = 4;
            buf[10] = memtype<=7?userfreq:userfreq*2;
            NvSetPstates(hdlGPU[0], buf)? printf("VRAM OC failed!\n") : printf("RAM OC OK: %d MHz\n", userfreq/1000);
            free(buf);
        } else {
            printf("\nRAM Frequency not in safe range (-250MHz to +250MHz).\n");
            return 1;
        }
    }

    NvUnload();
    return 0;
}
Now I have a stack of games to play with mind blowing framerate, hence this post is over.