Overclocking tools for Nvidia GPUs suck, so I made my own.

After spending most of the past decade without a decent computer, making do with laptops whose GPUs were better at toasting bread than at proper gaming, I finally cracked open the spare Bitcoin piggy bank and built my dream machine around an i7-4790K and an Nvidia GTX 970.

I could play Witcher 3 at last, so many great games to catch up on. :-)

But before that I had to get the maximum performance the hardware can provide through overclocking.

The point is, I’m a nerd and nerds like to tweak things. We’re not the kind that puts up with bloated closed-source software and crappy Xmas-tree GUIs. So I needed a simple, snappy tool to overclock my brand new GPU and bring it on par with a 980.

Let’s be honest here, the main offenders would make the eyes of any sane person bleed instantly:

[Screenshot: MSI Afterburner]

[Screenshot: EVGA Precision]

[Screenshot: Gigabyte OC Guru]

[Screenshot: ASUS GPU Tweak]

Those tools are respectively from MSI, EVGA, Gigabyte and Asus.

A quick look at the interfaces and features suggests they are very similar and probably all built upon the same toolkit, which is called “RTHAL” in MSI Afterburner.

Anyways this is bad software and their authors should feel bad. Not everyone buying those graphics cards is a 14yo xXX_l33thaxor1ny0ma|\/|4_XXx who wants dragons and giant robots on their packaging.

There is no lightweight, open source overclocking software for power users these days, mostly because GPU makers won’t publish their docs. The situation needs a fix.

Where do we start?

Nvidia has an API to talk to their driver, at least under Windows. It’s conveniently named NvAPI and it is documented here: GPU Performance State Interface.

What hits the hopeful coder square in the face when reading it is the very reduced set of functions available:

NVAPI_INTERFACE 	NvAPI_GPU_GetPstates20 (__in NvPhysicalGpuHandle hPhysicalGpu, __inout NV_GPU_PERF_PSTATES20_INFO *pPstatesInfo)
NVAPI_INTERFACE 	NvAPI_GPU_GetCurrentPstate (NvPhysicalGpuHandle hPhysicalGpu, NV_GPU_PERF_PSTATE_ID *pCurrentPstate)
NVAPI_INTERFACE 	NvAPI_GPU_GetDynamicPstatesInfoEx (NvPhysicalGpuHandle hPhysicalGpu, NV_GPU_DYNAMIC_PSTATES_INFO_EX *pDynamicPstatesInfoEx)

Yep, that also sucks: plenty of functions with “Get” in their names but almost none with “Set”.

Indeed, this public API is incomplete. After lurking around the interwebs, it seems the full-featured API, headers, libs and docs are only provided under an NDA, and I suppose there is not the slightest chance I could access that information legitimately.

I don’t have time to waste jumping through those hoops, creating accounts and whatnot either. I just want to overclock my GPU, and for once the Internet hasn’t had anything of that sort readily available since RivaTuner, which was never open source in the first place.

So let’s grab a shovel and go deeper.

 

What do we know, what do we need?

We know for sure that the tools from card makers can overclock through NvAPI by calling its undocumented functions.

Probably if there’s some “public” Get…

NvAPI_GPU_GetPstates20

then there’s a “private” Set… hiding somewhere:

NvAPI_GPU_SetPstates20

It may be more complicated than that for all we know, so let’s start by running MSI Afterburner inside OllyDbg. Browsing the string references quickly lands us here:

[Screenshot: OllyDbg string references, at the code that loads nvapi.dll]

“nvapi.dll” definitely gets loaded here using LoadLibrary/GetModuleHandle. We’re on the right track.

Now where exactly is that lib used? There could be thousands of occurrences.

That’s simple. With the program running and the realtime graph disabled (it polls NvAPI constantly, adding noise to the mass of API calls), we place a memory breakpoint on the .text segment of nvapi.dll inside MSI Afterburner’s process (just hit F2 in the segments window when NvAPI is highlighted…).

Then we set the sliders in the MSI tool to get some negligible GPU underclock and hit the “apply” button. It breaks inside NvAPI… magic!

[Screenshot: OllyDbg stopped inside NvAPI_QueryInterface]

But wait, this isn’t the “overclocking” function (SetPstates20()): the symbol for the return pointer on top of the stack shows something along the lines of “QueryInterface”.

Long story short, this “NvAPI_QueryInterface” function is the only function exported from nvapi.dll:

[Screenshot: export table of nvapi.dll]

Its purpose is to take the ID of a function in the API and return a pointer to the actual code of the function in the mapped process. It probably serves as a convenient layer for not breaking the API across updates and also for obfuscating the entry points where the goods are to be found.

Actually, if you get the NvAPI SDK from Nvidia’s website you’ll find a linkable module inside the archive. It does no real work of its own and just acts as an “exports proxy”: it exports the names of all the public functions of the API, and when one of them is called it retrieves the real pointer using the ID it holds and executes the real function.

Ultimately the end user/programmer doesn’t have to be aware of all those ID things; they just link the module from the public SDK, include the public headers and call the public functions.
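To make the proxy idea concrete, here is a minimal, portable sketch of what such a stub might look like. It is not Nvidia’s code: `fake_query_interface`, `real_get_answer`, `Proxy_GetAnswer` and the ID `0x12345678` are all made up for illustration, standing in for the DLL’s QueryInterface and one real API function.

```c
#include <stddef.h>

/* Hypothetical stand-in for a function that lives inside the DLL. */
static int real_get_answer(int *out)
{
    *out = 42;
    return 0; /* 0 means success in this sketch */
}

typedef void (*generic_fn)(void);

/* Hypothetical stand-in for QueryInterface: maps an ID to a function
 * pointer, or NULL when the ID is unknown. */
static generic_fn fake_query_interface(unsigned int id)
{
    if (id == 0x12345678u)
        return (generic_fn)real_get_answer;
    return NULL;
}

/* The "exports proxy": a named function that only holds an ID. It resolves
 * the real pointer on first use, caches it, then forwards the call. */
int Proxy_GetAnswer(int *out)
{
    typedef int (*get_answer_t)(int *);
    static get_answer_t cached = NULL;

    if (cached == NULL)
        cached = (get_answer_t)fake_query_interface(0x12345678u);
    if (cached == NULL)
        return -1; /* the ID is not known to this build of the "DLL" */
    return cached(out);
}
```

A caller just writes `int v; Proxy_GetAnswer(&v);` and never sees the ID, which is exactly why the IDs stay out of sight for anyone using the public SDK.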

You may already have guessed, I don’t want to proceed that way.

Luckily, if you look again at the previous screenshot of OllyDbg inside the QueryInterface() function, you’ll find the sole int argument to the function on the stack, just under the return pointer: it’s 0xF4DAE6B. We’re getting closer!

Let’s continue running the program in Olly and break a second time on NvAPI. We learn from the symbols floating around that MSI Afterburner just initiated a call to “Nv_SetPStates20()”, so 0xF4DAE6B is certainly the ID of the function we’re looking for.

Good, we just need its prototype and arguments to be able to declare and use it in our own code.

Also, a quick web search for 0xF4DAE6B yielded a very interesting result: an amazing Russian dude with only 2 messages on his stackoverflow.com profile still found a way to drop this sweet piece of data, which looks disturbingly like what the NDA version of the API would be:

_NvAPI_Initialize   150E828h
_NvAPI_Unload   0D22BDD7Eh
_NvAPI_GetErrorMessage  6C2D048Ch
_NvAPI_GetInterfaceVersionString    1053FA5h
_NvAPI_GetDisplayDriverVersion  0F951A4D1h
_NvAPI_SYS_GetDriverAndBranchVersion    2926AAADh
_NvAPI_EnumNvidiaDisplayHandle  9ABDD40Dh
_NvAPI_EnumNvidiaUnAttachedDisplayHandle    20DE9260h
_NvAPI_EnumPhysicalGPUs 0E5AC921Fh
_NvAPI_EnumLogicalGPUs  48B3EA59h
_NvAPI_GetPhysicalGPUsFromDisplay   34EF9506h
_NvAPI_GetPhysicalGPUFromUnAttachedDisplay  5018ED61h
_NvAPI_CreateDisplayFromUnAttachedDisplay   63F9799Eh
_NvAPI_GetLogicalGPUFromDisplay 0EE1370CFh
_NvAPI_GetLogicalGPUFromPhysicalGPU 0ADD604D1h
_NvAPI_GetPhysicalGPUsFromLogicalGPU    0AEA3FA32h
_NvAPI_GetAssociatedNvidiaDisplayHandle 35C29134h
_NvAPI_DISP_GetAssociatedUnAttachedNvidiaDisplayHandle  0A70503B2h
_NvAPI_GetAssociatedNvidiaDisplayName   22A78B05h
_NvAPI_GetUnAttachedAssociatedDisplayName   4888D790h
_NvAPI_EnableHWCursor   2863148Dh
_NvAPI_DisableHWCursor  0AB163097h
_NvAPI_GetVBlankCounter 67B5DB55h
_NvAPI_SetRefreshRateOverride   3092AC32h
_NvAPI_GetAssociatedDisplayOutputId 0D995937Eh
_NvAPI_GetDisplayPortInfo   0C64FF367h
_NvAPI_SetDisplayPort   0FA13E65Ah
_NvAPI_GetHDMISupportInfo   6AE16EC3h
_NvAPI_DISP_EnumHDMIStereoModes 0D2CCF5D6h
_NvAPI_GetInfoFrame 9734F1Dh
_NvAPI_SetInfoFrame 69C6F365h
_NvAPI_SetInfoFrameState    67EFD887h
_NvAPI_GetInfoFrameState    41511594h
_NvAPI_Disp_InfoFrameControl    6067AF3Fh
_NvAPI_Disp_ColorControl    92F9D80Dh
_NvAPI_DISP_GetVirtualModeData  3230D69Ah
_NvAPI_DISP_OverrideDisplayModeList 291BFF2h
_NvAPI_GetDisplayDriverMemoryInfo   774AA982h
_NvAPI_GetDriverMemoryInfo  2DC95125h
_NvAPI_GetDVCInfo   4085DE45h
_NvAPI_SetDVCLevel  172409B4h
_NvAPI_GetDVCInfoEx 0E45002Dh
_NvAPI_SetDVCLevelEx    4A82C2B1h
_NvAPI_GetHUEInfo   95B64341h
_NvAPI_SetHUEAngle  0F5A0F22Ch
_NvAPI_GetImageSharpeningInfo   9FB063DFh
_NvAPI_SetImageSharpeningLevel  3FC9A59Ch
_NvAPI_D3D_GetCurrentSLIState   4B708B54h
_NvAPI_D3D9_RegisterResource    0A064BDFCh
_NvAPI_D3D9_UnregisterResource  0BB2B17AAh
_NvAPI_D3D9_AliasSurfaceAsTexture   0E5CEAE41h
_NvAPI_D3D9_StretchRectEx   22DE03AAh
_NvAPI_D3D9_ClearRT 332D3942h
_NvAPI_D3D_CreateQuery  5D19BCA4h
_NvAPI_D3D_DestroyQuery 0C8FF7258h
_NvAPI_D3D_Query_Begin  0E5A9AAE0h
_NvAPI_D3D_Query_End    2AC084FAh
_NvAPI_D3D_Query_GetData    0F8B53C69h
_NvAPI_D3D_Query_GetDataSize    0F2A54796h
_NvAPI_D3D_Query_GetType    4ACEEAF7h
_NvAPI_D3D_RegisterApp  0D44D3C4Eh
_NvAPI_D3D9_CreatePathContextNV 0A342F682h
_NvAPI_D3D9_DestroyPathContextNV    667C2929h
_NvAPI_D3D9_CreatePathNV    71329DF3h
_NvAPI_D3D9_DeletePathNV    73E0019Ah
_NvAPI_D3D9_PathVerticesNV  0C23DF926h
_NvAPI_D3D9_PathParameterfNV    0F7FF00C1h
_NvAPI_D3D9_PathParameteriNV    0FC31236Ch
_NvAPI_D3D9_PathMatrixNV    0D2F6C499h
_NvAPI_D3D9_PathDepthNV 0FCB16330h
_NvAPI_D3D9_PathClearDepthNV    157E45C4h
_NvAPI_D3D9_PathEnableDepthTestNV   0E99BA7F3h
_NvAPI_D3D9_PathEnableColorWriteNV  3E2804A2h
_NvAPI_D3D9_DrawPathNV  13199B3Dh
_NvAPI_D3D9_GetSurfaceHandle    0F2DD3F2h
_NvAPI_D3D9_GetOverlaySurfaceHandles    6800F5FCh
_NvAPI_D3D9_GetTextureHandle    0C7985ED5h
_NvAPI_D3D9_GpuSyncGetHandleSize    80C9FD3Bh
_NvAPI_D3D9_GpuSyncInit 6D6FDAD4h
_NvAPI_D3D9_GpuSyncEnd  754033F0h
_NvAPI_D3D9_GpuSyncMapTexBuffer 0CDE4A28Ah
_NvAPI_D3D9_GpuSyncMapSurfaceBuffer 2AB714ABh
_NvAPI_D3D9_GpuSyncMapVertexBuffer  0DBC803ECh
_NvAPI_D3D9_GpuSyncMapIndexBuffer   12EE68F2h
_NvAPI_D3D9_SetPitchSurfaceCreation 18CDF365h
_NvAPI_D3D9_GpuSyncAcquire  0D00B8317h
_NvAPI_D3D9_GpuSyncRelease  3D7A86BBh
_NvAPI_D3D9_GetCurrentRenderTargetHandle    22CAD61h
_NvAPI_D3D9_GetCurrentZBufferHandle 0B380F218h
_NvAPI_D3D9_GetIndexBufferHandle    0FC5A155Bh
_NvAPI_D3D9_GetVertexBufferHandle   72B19155h
_NvAPI_D3D9_CreateTexture   0D5E13573h
_NvAPI_D3D9_AliasPrimaryAsTexture   13C7112Eh
_NvAPI_D3D9_PresentSurfaceToDesktop 0F7029C5h
_NvAPI_D3D9_CreateVideoBegin    84C9D553h
_NvAPI_D3D9_CreateVideoEnd  0B476BF61h
_NvAPI_D3D9_CreateVideo 89FFD9A3h
_NvAPI_D3D9_FreeVideo   3111BED1h
_NvAPI_D3D9_PresentVideo    5CF7F862h
_NvAPI_D3D9_VideoSetStereoInfo  0B852F4DBh
_NvAPI_D3D9_SetGamutData    2BBDA32Eh
_NvAPI_D3D9_SetSurfaceCreationLayout    5609B86Ah
_NvAPI_D3D9_GetVideoCapabilities    3D596B93h
_NvAPI_D3D9_QueryVideoInfo  1E6634B3h
_NvAPI_D3D9_AliasPrimaryFromDevice  7C20C5BEh
_NvAPI_D3D9_SetResourceHint 905F5C27h
_NvAPI_D3D9_Lock    6317345Ch
_NvAPI_D3D9_Unlock  0C182027Eh
_NvAPI_D3D9_GetVideoState   0A4527BF8h
_NvAPI_D3D9_SetVideoState   0BD4BC56Fh
_NvAPI_D3D9_EnumVideoFeatures   1DB7C52Ch
_NvAPI_D3D9_GetSLIInfo  694BFF4Dh
_NvAPI_D3D9_SetSLIMode  0BFDC062Ch
_NvAPI_D3D9_QueryAAOverrideMode 0DDF5643Ch
_NvAPI_D3D9_VideoSurfaceEncryptionControl   9D2509EFh
_NvAPI_D3D9_DMA 962B8AF6h
_NvAPI_D3D9_EnableStereo    492A6954h
_NvAPI_D3D9_StretchRect 0AEAECD41h
_NvAPI_D3D9_CreateRenderTarget  0B3827C8h
_NvAPI_D3D9_NVFBC_GetStatus 0BD3EB475h
_NvAPI_D3D9_IFR_SetUpTargetBufferToSys  55255D05h
_NvAPI_D3D9_GPUBasedCPUSleep    0D504DDA7h
_NvAPI_D3D9_IFR_TransferRenderTarget    0AB7C2DCh
_NvAPI_D3D9_IFR_SetUpTargetBufferToNV12BLVideoSurface   0CFC92C15h
_NvAPI_D3D9_IFR_TransferRenderTargetToNV12BLVideoSurface    5FE72F64h
_NvAPI_D3D10_AliasPrimaryAsTexture  8AAC133Dh
_NvAPI_D3D10_SetPrimaryFlipChainCallbacks   73EB9329h
_NvAPI_D3D10_ProcessCallbacks   0AE9C2019h
_NvAPI_D3D10_GetRenderedCursorAsBitmap  0CAC3CE5Dh
_NvAPI_D3D10_BeginShareResource 35233210h
_NvAPI_D3D10_BeginShareResourceEx   0EF303A9Dh
_NvAPI_D3D10_EndShareResource   0E9C5853h
_NvAPI_D3D10_SetDepthBoundsTest 4EADF5D2h
_NvAPI_D3D10_CreateDevice   2DE11D61h
_NvAPI_D3D10_CreateDeviceAndSwapChain   5B803DAFh
_NvAPI_D3D11_CreateDevice   6A16D3A0h
_NvAPI_D3D11_CreateDeviceAndSwapChain   0BB939EE5h
_NvAPI_D3D11_BeginShareResource 121BDC6h
_NvAPI_D3D11_EndShareResource   8FFB8E26h
_NvAPI_D3D11_SetDepthBoundsTest 7AAF7A04h
_NvAPI_GPU_GetShaderPipeCount   63E2F56Fh
_NvAPI_GPU_GetShaderSubPipeCount    0BE17923h
_NvAPI_GPU_GetPartitionCount    86F05D7Ah
_NvAPI_GPU_GetMemPartitionMask  329D77CDh
_NvAPI_GPU_GetTPCMask   4A35DF54h
_NvAPI_GPU_GetSMMask    0EB7AF173h
_NvAPI_GPU_GetTotalTPCCount 4E2F76A8h
_NvAPI_GPU_GetTotalSMCount  0AE5FBCFEh
_NvAPI_GPU_GetTotalSPCount  0B6D62591h
_NvAPI_GPU_GetGpuCoreCount  0C7026A87h
_NvAPI_GPU_GetAllOutputs    7D554F8Eh
_NvAPI_GPU_GetConnectedOutputs  1730BFC9h
_NvAPI_GPU_GetConnectedSLIOutputs   680DE09h
_NvAPI_GPU_GetConnectedDisplayIds   78DBA2h
_NvAPI_GPU_GetAllDisplayIds 785210A2h
_NvAPI_GPU_GetConnectedOutputsWithLidState  0CF8CAF39h
_NvAPI_GPU_GetConnectedSLIOutputsWithLidState   96043CC7h
_NvAPI_GPU_GetSystemType    0BAAABFCCh
_NvAPI_GPU_GetActiveOutputs 0E3E89B6Fh
_NvAPI_GPU_GetEDID  37D32E69h
_NvAPI_GPU_SetEDID  0E83D6456h
_NvAPI_GPU_GetOutputType    40A505E4h
_NvAPI_GPU_GetDeviceDisplayMode 0D2277E3Ah
_NvAPI_GPU_GetFlatPanelInfo 36CFF969h
_NvAPI_GPU_ValidateOutputCombination    34C9C2D4h
_NvAPI_GPU_GetConnectorInfo 4ECA2C10h
_NvAPI_GPU_GetFullName  0CEEE8E9Fh
_NvAPI_GPU_GetPCIIdentifiers    2DDFB66Eh
_NvAPI_GPU_GetGPUType   0C33BAEB1h
_NvAPI_GPU_GetBusType   1BB18724h
_NvAPI_GPU_GetBusId 1BE0B8E5h
_NvAPI_GPU_GetBusSlotId 2A0A350Fh
_NvAPI_GPU_GetIRQ   0E4715417h
_NvAPI_GPU_GetVbiosRevision 0ACC3DA0Ah
_NvAPI_GPU_GetVbiosOEMRevision  2D43FB31h
_NvAPI_GPU_GetVbiosVersionString    0A561FD7Dh
_NvAPI_GPU_GetAGPAperture   6E042794h
_NvAPI_GPU_GetCurrentAGPRate    0C74925A0h
_NvAPI_GPU_GetCurrentPCIEDownstreamWidth    0D048C3B1h
_NvAPI_GPU_GetPhysicalFrameBufferSize   46FBEB03h
_NvAPI_GPU_GetVirtualFrameBufferSize    5A04B644h
_NvAPI_GPU_GetQuadroStatus  0E332FA47h
_NvAPI_GPU_GetBoardInfo 22D54523h
_NvAPI_GPU_GetRamType   57F7CAACh
_NvAPI_GPU_GetFBWidthAndLocation    11104158h
_NvAPI_GPU_GetAllClockFrequencies   0DCB616C3h
_NvAPI_GPU_GetPerfClocks    1EA54A3Bh
_NvAPI_GPU_SetPerfClocks    7BCF4ACh
_NvAPI_GPU_GetCoolerSettings    0DA141340h
_NvAPI_GPU_SetCoolerLevels  891FA0AEh
_NvAPI_GPU_RestoreCoolerSettings    8F6ED0FBh
_NvAPI_GPU_GetCoolerPolicyTable 518A32Ch
_NvAPI_GPU_SetCoolerPolicyTable 987947CDh
_NvAPI_GPU_RestoreCoolerPolicyTable 0D8C4FE63h
_NvAPI_GPU_GetPstatesInfo   0BA94C56Eh
_NvAPI_GPU_GetPstatesInfoEx 843C0256h
_NvAPI_GPU_SetPstatesInfo   0CDF27911h
_NvAPI_GPU_GetPstates20 6FF81213h
_NvAPI_GPU_SetPstates20 0F4DAE6Bh
_NvAPI_GPU_GetCurrentPstate 927DA4F6h
_NvAPI_GPU_GetPstateClientLimits    88C82104h
_NvAPI_GPU_SetPstateClientLimits    0FDFC7D49h
_NvAPI_GPU_EnableOverclockedPstates 0B23B70EEh
_NvAPI_GPU_EnableDynamicPstates 0FA579A0Fh
_NvAPI_GPU_GetDynamicPstatesInfoEx  60DED2EDh
_NvAPI_GPU_GetVoltages  7D656244h
_NvAPI_GPU_GetThermalSettings   0E3640A56h
_NvAPI_GPU_SetDitherControl 0DF0DFCDDh
_NvAPI_GPU_GetDitherControl 932AC8FBh
_NvAPI_GPU_GetColorSpaceConversion  8159E87Ah
_NvAPI_GPU_SetColorSpaceConversion  0FCABD23Ah
_NvAPI_GetTVOutputInfo  30C805D5h
_NvAPI_GetTVEncoderControls 5757474Ah
_NvAPI_SetTVEncoderControls 0CA36A3ABh
_NvAPI_GetTVOutputBorderColor   6DFD1C8Ch
_NvAPI_SetTVOutputBorderColor   0AED02700h
_NvAPI_GetDisplayPosition   6BB1EE5Dh
_NvAPI_SetDisplayPosition   57D9060Fh
_NvAPI_GetValidGpuTopologies    5DFAB48Ah
_NvAPI_GetInvalidGpuTopologies  15658BE6h
_NvAPI_SetGpuTopologies 25201F3Dh
_NvAPI_GPU_GetPerGpuTopologyStatus  0A81F8992h
_NvAPI_SYS_GetChipSetTopologyStatus 8A50F126h
_NvAPI_GPU_Get_DisplayPort_DongleInfo   76A70E8Dh
_NvAPI_I2CRead  2FDE12C5h
_NvAPI_I2CWrite 0E812EB07h
_NvAPI_I2CWriteEx   283AC65Ah
_NvAPI_I2CReadEx    4D7B0709h
_NvAPI_GPU_GetPowerMizerInfo    76BFA16Bh
_NvAPI_GPU_SetPowerMizerInfo    50016C78h
_NvAPI_GPU_GetVoltageDomainsStatus  0C16C7E2Ch
_NvAPI_GPU_ClientPowerTopologyGetInfo   0A4DFD3F2h
_NvAPI_GPU_ClientPowerTopologyGetStatus 0EDCF624Eh
_NvAPI_GPU_ClientPowerPoliciesGetInfo   34206D86h
_NvAPI_GPU_ClientPowerPoliciesGetStatus 70916171h
_NvAPI_GPU_ClientPowerPoliciesSetStatus 0AD95F5EDh
_NvAPI_GPU_WorkstationFeatureSetup  6C1F3FE4h
_NvAPI_SYS_GetChipSetInfo   53DABBCAh
_NvAPI_SYS_GetLidAndDockInfo    0CDA14D8Ah
_NvAPI_OGL_ExpertModeSet    3805EF7Ah
_NvAPI_OGL_ExpertModeGet    22ED9516h
_NvAPI_OGL_ExpertModeDefaultsSet    0B47A657Eh
_NvAPI_OGL_ExpertModeDefaultsGet    0AE921F12h
_NvAPI_SetDisplaySettings   0E04F3D86h
_NvAPI_GetDisplaySettings   0DC27D5D4h
_NvAPI_GetTiming    0AFC4833Eh
_NvAPI_DISP_GetMonitorCapabilities  3B05C7E1h
_NvAPI_EnumCustomDisplay    42892957h
_NvAPI_TryCustomDisplay 0BF6C1762h
_NvAPI_RevertCustomDisplayTrial 854BA405h
_NvAPI_DeleteCustomDisplay  0E7CB998Dh
_NvAPI_SaveCustomDisplay    0A9062C78h
_NvAPI_QueryUnderscanCap    61D7B624h
_NvAPI_EnumUnderscanConfig  4144111Ah
_NvAPI_DeleteUnderscanConfig    0F98854C8h
_NvAPI_SetUnderscanConfig   3EFADA1Dh
_NvAPI_GetDisplayFeatureConfig  8E985CCDh
_NvAPI_SetDisplayFeatureConfig  0F36A668Dh
_NvAPI_GetDisplayFeatureConfigDefaults  0F5F4D01h
_NvAPI_SetView  957D7B6h
_NvAPI_GetView  0D6B99D89h
_NvAPI_SetViewEx    6B89E68h
_NvAPI_GetViewEx    0DBBC0AF4h
_NvAPI_GetSupportedViews    66FB7FC0h
_NvAPI_GetHDCPLinkParameters    0B3BB0772h
_NvAPI_Disp_DpAuxChannelControl 8EB56969h
_NvAPI_SetHybridMode    0FB22D656h
_NvAPI_GetHybridMode    0E23B68C1h
_NvAPI_Coproc_GetCoprocStatus   1EFC3957h
_NvAPI_Coproc_SetCoprocInfoFlagsEx  0F4C863ACh
_NvAPI_Coproc_GetCoprocInfoFlagsEx  69A9874Dh
_NvAPI_Coproc_NotifyCoprocPowerState    0CADCB956h
_NvAPI_Coproc_GetApplicationCoprocInfo  79232685h
_NvAPI_GetVideoState    1C5659CDh
_NvAPI_SetVideoState    54FE75Ah
_NvAPI_SetFrameRateNotify   18919887h
_NvAPI_SetPVExtName 4FEEB498h
_NvAPI_GetPVExtName 2F5B08E0h
_NvAPI_SetPVExtProfile  8354A8F4h
_NvAPI_GetPVExtProfile  1B1B9A16h
_NvAPI_VideoSetStereoInfo   97063269h
_NvAPI_VideoGetStereoInfo   8E1F8CFEh
_NvAPI_Mosaic_GetSupportedTopoInfo  0FDB63C81h
_NvAPI_Mosaic_GetTopoGroup  0CB89381Dh
_NvAPI_Mosaic_GetOverlapLimits  989685F0h
_NvAPI_Mosaic_SetCurrentTopo    9B542831h
_NvAPI_Mosaic_GetCurrentTopo    0EC32944Eh
_NvAPI_Mosaic_EnableCurrentTopo 5F1AA66Ch
_NvAPI_Mosaic_SetGridTopology   3F113C77h
_NvAPI_Mosaic_GetMosaicCapabilities 0DA97071Eh
_NvAPI_Mosaic_GetDisplayCapabilities    0D58026B9h
_NvAPI_Mosaic_EnumGridTopologies    0A3C55220h
_NvAPI_Mosaic_GetDisplayViewportsByResolution   0DC6DC8D3h
_NvAPI_Mosaic_GetMosaicViewports    7EBA036h
_NvAPI_Mosaic_SetDisplayGrids   4D959A89h
_NvAPI_Mosaic_ValidateDisplayGridsWithSLI   1ECFD263h
_NvAPI_Mosaic_ValidateDisplayGrids  0CF43903Dh
_NvAPI_Mosaic_EnumDisplayModes  78DB97D7h
_NvAPI_Mosaic_ChooseGpuTopologies   0B033B140h
_NvAPI_Mosaic_EnumDisplayGrids  0DF2887AFh
_NvAPI_GetSupportedMosaicTopologies 410B5C25h
_NvAPI_GetCurrentMosaicTopology 0F60852BDh
_NvAPI_SetCurrentMosaicTopology 0D54B8989h
_NvAPI_EnableCurrentMosaicTopology  74073CC9h
_NvAPI_QueryNonMigratableApps   0BB9EF1C3h
_NvAPI_GPU_QueryActiveApps  65B1C5F5h
_NvAPI_Hybrid_QueryUnblockedNonMigratableApps   5F35BCB5h
_NvAPI_Hybrid_QueryBlockedMigratableApps    0F4C2F8CCh
_NvAPI_Hybrid_SetAppMigrationState  0FA0B9A59h
_NvAPI_Hybrid_IsAppMigrationStateChangeable 584CB0B6h
_NvAPI_GPU_GPIOQueryLegalPins   0FAB69565h
_NvAPI_GPU_GPIOReadFromPin  0F5E10439h
_NvAPI_GPU_GPIOWriteToPin   0F3B11E68h
_NvAPI_GPU_GetHDCPSupportStatus 0F089EEF5h
_NvAPI_SetTopologyFocusDisplayAndView   0A8064F9h
_NvAPI_Stereo_CreateConfigurationProfileRegistryKey 0BE7692ECh
_NvAPI_Stereo_DeleteConfigurationProfileRegistryKey 0F117B834h
_NvAPI_Stereo_SetConfigurationProfileValue  24409F48h
_NvAPI_Stereo_DeleteConfigurationProfileValue   49BCEECFh
_NvAPI_Stereo_Enable    239C4545h
_NvAPI_Stereo_Disable   2EC50C2Bh
_NvAPI_Stereo_IsEnabled 348FF8E1h
_NvAPI_Stereo_GetStereoCaps 0DFC063B7h
_NvAPI_Stereo_GetStereoSupport  296C434Dh
_NvAPI_Stereo_CreateHandleFromIUnknown  0AC7E37F4h
_NvAPI_Stereo_DestroyHandle 3A153134h
_NvAPI_Stereo_Activate  0F6A1AD68h
_NvAPI_Stereo_Deactivate    2D68DE96h
_NvAPI_Stereo_IsActivated   1FB0BC30h
_NvAPI_Stereo_GetSeparation 451F2134h
_NvAPI_Stereo_SetSeparation 5C069FA3h
_NvAPI_Stereo_DecreaseSeparation    0DA044458h
_NvAPI_Stereo_IncreaseSeparation    0C9A8ECECh
_NvAPI_Stereo_GetConvergence    4AB00934h
_NvAPI_Stereo_SetConvergence    3DD6B54Bh
_NvAPI_Stereo_DecreaseConvergence   4C87E317h
_NvAPI_Stereo_IncreaseConvergence   0A17DAABEh
_NvAPI_Stereo_GetFrustumAdjustMode  0E6839B43h
_NvAPI_Stereo_SetFrustumAdjustMode  7BE27FA2h
_NvAPI_Stereo_CaptureJpegImage  932CB140h
_NvAPI_Stereo_CapturePngImage   8B7E99B5h
_NvAPI_Stereo_ReverseStereoBlitControl  3CD58F89h
_NvAPI_Stereo_SetNotificationMessage    6B9B409Eh
_NvAPI_Stereo_SetActiveEye  96EEA9F8h
_NvAPI_Stereo_SetDriverMode 5E8F0BECh
_NvAPI_Stereo_GetEyeSeparation  0CE653127h
_NvAPI_Stereo_IsWindowedModeSupported   40C8ED5Eh
_NvAPI_Stereo_AppHandShake  8C610BDAh
_NvAPI_Stereo_HandShake_Trigger_Activation  0B30CD1A7h
_NvAPI_Stereo_HandShake_Message_Control 315E0EF0h
_NvAPI_Stereo_SetSurfaceCreationMode    0F5DCFCBAh
_NvAPI_Stereo_GetSurfaceCreationMode    36F1C736h
_NvAPI_Stereo_Debug_WasLastDrawStereoized   0ED4416C5h
_NvAPI_Stereo_ForceToScreenDepth    2D495758h
_NvAPI_Stereo_SetVertexShaderConstantF  416C07B3h
_NvAPI_Stereo_SetVertexShaderConstantB  5268716Fh
_NvAPI_Stereo_SetVertexShaderConstantI  7923BA0Eh
_NvAPI_Stereo_GetVertexShaderConstantF  622FDC87h
_NvAPI_Stereo_GetVertexShaderConstantB  712BAA5Bh
_NvAPI_Stereo_GetVertexShaderConstantI  5A60613Ah
_NvAPI_Stereo_SetPixelShaderConstantF   0A9657F32h
_NvAPI_Stereo_SetPixelShaderConstantB   0BA6109EEh
_NvAPI_Stereo_SetPixelShaderConstantI   912AC28Fh
_NvAPI_Stereo_GetPixelShaderConstantF   0D4974572h
_NvAPI_Stereo_GetPixelShaderConstantB   0C79333AEh
_NvAPI_Stereo_GetPixelShaderConstantI   0ECD8F8CFh
_NvAPI_Stereo_SetDefaultProfile 44F0ECD1h
_NvAPI_Stereo_GetDefaultProfile 624E21C2h
_NvAPI_Stereo_Is3DCursorSupported   0D7C9EC09h
_NvAPI_Stereo_GetCursorSeparation   72162B35h
_NvAPI_Stereo_SetCursorSeparation   0FBC08FC1h
_NvAPI_VIO_GetCapabilities  1DC91303h
_NvAPI_VIO_Open 44EE4841h
_NvAPI_VIO_Close    0D01BD237h
_NvAPI_VIO_Status   0E6CE4F1h
_NvAPI_VIO_SyncFormatDetect 118D48A3h
_NvAPI_VIO_GetConfig    0D34A789Bh
_NvAPI_VIO_SetConfig    0E4EEC07h
_NvAPI_VIO_SetCSC   0A1EC8D74h
_NvAPI_VIO_GetCSC   7B0D72A3h
_NvAPI_VIO_SetGamma 964BF452h
_NvAPI_VIO_GetGamma 51D53D06h
_NvAPI_VIO_SetSyncDelay 2697A8D1h
_NvAPI_VIO_GetSyncDelay 462214A9h
_NvAPI_VIO_GetPCIInfo   0B981D935h
_NvAPI_VIO_IsRunning    96BD040Eh
_NvAPI_VIO_Start    0CDE8E1A3h
_NvAPI_VIO_Stop 6BA2A5D6h
_NvAPI_VIO_IsFrameLockModeCompatible    7BF0A94Dh
_NvAPI_VIO_EnumDevices  0FD7C5557h
_NvAPI_VIO_QueryTopology    869534E2h
_NvAPI_VIO_EnumSignalFormats    0EAD72FE4h
_NvAPI_VIO_EnumDataFormats  221FA8E8h
_NvAPI_GPU_GetTachReading   5F608315h
_NvAPI_3D_GetProperty   8061A4B1h
_NvAPI_3D_SetProperty   0C9175E8Dh
_NvAPI_3D_GetPropertyRange  0B85DE27Ch
_NvAPI_GPS_GetPowerSteeringStatus   540EE82Eh
_NvAPI_GPS_SetPowerSteeringStatus   9723D3A2h
_NvAPI_GPS_SetVPStateCap    68888EB4h
_NvAPI_GPS_GetVPStateCap    71913023h
_NvAPI_GPS_GetThermalLimit  583113EDh
_NvAPI_GPS_SetThermalLimit  0C07E210Fh
_NvAPI_GPS_GetPerfSensors   271C1109h
_NvAPI_SYS_GetDisplayIdFromGpuAndOutputId   8F2BAB4h
_NvAPI_SYS_GetGpuAndOutputIdFromDisplayId   112BA1A5h
_NvAPI_DISP_GetDisplayIdByDisplayName   0AE457190h
_NvAPI_DISP_GetGDIPrimaryDisplayId  1E9D8A31h
_NvAPI_DISP_GetDisplayConfig    11ABCCF8h
_NvAPI_DISP_SetDisplayConfig    5D8CF8DEh
_NvAPI_GPU_GetPixelClockRange   66AF10B7h
_NvAPI_GPU_SetPixelClockRange   5AC7F8E5h
_NvAPI_GPU_GetECCStatusInfo 0CA1DDAF3h
_NvAPI_GPU_GetECCErrorInfo  0C71F85A6h
_NvAPI_GPU_ResetECCErrorInfo    0C02EEC20h
_NvAPI_GPU_GetECCConfigurationInfo  77A796F3h
_NvAPI_GPU_SetECCConfiguration  1CF639D9h
_NvAPI_D3D1x_CreateSwapChain    1BC21B66h
_NvAPI_D3D9_CreateSwapChain 1A131E09h
_NvAPI_D3D_SetFPSIndicatorState 0A776E8DBh
_NvAPI_D3D9_Present 5650BEBh
_NvAPI_D3D9_QueryFrameCount 9083E53Ah
_NvAPI_D3D9_ResetFrameCount 0FA6A0675h
_NvAPI_D3D9_QueryMaxSwapGroup   5995410Dh
_NvAPI_D3D9_QuerySwapGroup  0EBA4D232h
_NvAPI_D3D9_JoinSwapGroup   7D44BB54h
_NvAPI_D3D9_BindSwapBarrier 9C39C246h
_NvAPI_D3D1x_Present    3B845A1h
_NvAPI_D3D1x_QueryFrameCount    9152E055h
_NvAPI_D3D1x_ResetFrameCount    0FBBB031Ah
_NvAPI_D3D1x_QueryMaxSwapGroup  9BB9D68Fh
_NvAPI_D3D1x_QuerySwapGroup 407F67AAh
_NvAPI_D3D1x_JoinSwapGroup  14610CD7h
_NvAPI_D3D1x_BindSwapBarrier    9DE8C729h
_NvAPI_SYS_VenturaGetState  0CB7C208Dh
_NvAPI_SYS_VenturaSetState  0CE2E9D9h
_NvAPI_SYS_VenturaGetCoolingBudget  0C9D86E33h
_NvAPI_SYS_VenturaSetCoolingBudget  85FF5A15h
_NvAPI_SYS_VenturaGetPowerReading   63685979h
_NvAPI_DISP_GetDisplayBlankingState 63E5D8DBh
_NvAPI_DISP_SetDisplayBlankingState 1E17E29Bh
_NvAPI_DRS_CreateSession    694D52Eh
_NvAPI_DRS_DestroySession   0DAD9CFF8h
_NvAPI_DRS_LoadSettings 375DBD6Bh
_NvAPI_DRS_SaveSettings 0FCBC7E14h
_NvAPI_DRS_LoadSettingsFromFile 0D3EDE889h
_NvAPI_DRS_SaveSettingsToFile   2BE25DF8h
_NvAPI_DRS_CreateProfile    0CC176068h
_NvAPI_DRS_DeleteProfile    17093206h
_NvAPI_DRS_SetCurrentGlobalProfile  1C89C5DFh
_NvAPI_DRS_GetCurrentGlobalProfile  617BFF9Fh
_NvAPI_DRS_GetProfileInfo   61CD6FD6h
_NvAPI_DRS_SetProfileInfo   16ABD3A9h
_NvAPI_DRS_FindProfileByName    7E4A9A0Bh
_NvAPI_DRS_EnumProfiles 0BC371EE0h
_NvAPI_DRS_GetNumProfiles   1DAE4FBCh
_NvAPI_DRS_CreateApplication    4347A9DEh
_NvAPI_DRS_DeleteApplicationEx  0C5EA85A1h
_NvAPI_DRS_DeleteApplication    2C694BC6h
_NvAPI_DRS_GetApplicationInfo   0ED1F8C69h
_NvAPI_DRS_EnumApplications 7FA2173Ah
_NvAPI_DRS_FindApplicationByName    0EEE566B2h
_NvAPI_DRS_SetSetting   577DD202h
_NvAPI_DRS_GetSetting   73BF8338h
_NvAPI_DRS_EnumSettings 0AE3039DAh
_NvAPI_DRS_EnumAvailableSettingIds  0F020614Ah
_NvAPI_DRS_EnumAvailableSettingValues   2EC39F90h
_NvAPI_DRS_GetSettingIdFromName 0CB7309CDh
_NvAPI_DRS_GetSettingNameFromId 0D61CBE6Eh
_NvAPI_DRS_DeleteProfileSetting 0E4A26362h
_NvAPI_DRS_RestoreAllDefaults   5927B094h
_NvAPI_DRS_RestoreProfileDefault    0FA5F6134h
_NvAPI_DRS_RestoreProfileDefaultSetting 53F0381Eh
_NvAPI_DRS_GetBaseProfile   0DA8466A0h
_NvAPI_Event_RegisterCallback   0E6DBEA69h
_NvAPI_Event_UnregisterCallback 0DE1F9B45h
_NvAPI_GPU_GetCurrentThermalLevel   0D2488B79h
_NvAPI_GPU_GetCurrentFanSpeedLevel  0BD71F0C9h
_NvAPI_GPU_SetScanoutIntensity  0A57457A4h
_NvAPI_GPU_SetScanoutWarping    0B34BAB4Fh
_NvAPI_GPU_GetScanoutConfiguration  6A9F5B63h
_NvAPI_DISP_SetHCloneTopology   61041C24h
_NvAPI_DISP_GetHCloneTopology   47BAD137h
_NvAPI_DISP_ValidateHCloneTopology  5F4C2664h
_NvAPI_GPU_GetPerfDecreaseInfo  7F7F4600h
_NvAPI_GPU_QueryIlluminationSupport 0A629DA31h
_NvAPI_GPU_GetIllumination  9A1B9365h
_NvAPI_GPU_SetIllumination  254A187h
_NvAPI_D3D1x_IFR_SetUpTargetBufferToSys 473F7828h
_NvAPI_D3D1x_IFR_TransferRenderTarget   9FBAE4EBh

“_NvAPI_GPU_SetPstates20 0F4DAE6Bh”: it checks out, we’re definitely heading the correct way!
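The listing uses assembler-style hex literals: a trailing `h`, plus a leading `0` whenever the first digit would otherwise be a letter. If you want to feed it to your own tooling, a small parser is enough. The sketch below is mine, not part of any Nvidia tool, and `parse_listing_line` is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse one line such as "_NvAPI_GPU_SetPstates20 0F4DAE6Bh".
 * On success, stores the name (without the leading underscore) and the
 * numeric ID, and returns 1. Returns 0 on any malformed input. */
int parse_listing_line(const char *line, char *name, size_t name_size,
                       unsigned long *id)
{
    char raw_name[128];
    char raw_hex[32];
    char *end;
    const char *src;
    size_t len;

    if (sscanf(line, "%127s %31s", raw_name, raw_hex) != 2)
        return 0;

    len = strlen(raw_hex);
    if (len < 2 || (raw_hex[len - 1] != 'h' && raw_hex[len - 1] != 'H'))
        return 0;
    raw_hex[len - 1] = '\0'; /* drop the trailing 'h' */

    /* A leading zero is harmless for strtoul in base 16. */
    *id = strtoul(raw_hex, &end, 16);
    if (*end != '\0')
        return 0;

    src = (raw_name[0] == '_') ? raw_name + 1 : raw_name;
    if (strlen(src) >= name_size)
        return 0;
    strcpy(name, src);
    return 1;
}
```

Running it over the whole listing gives a name/ID table that can be reused later to annotate debugger logs.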

In IDA Pro we can also check the xrefs from NvAPI_QueryInterface. We land at the start of the data section, on a big array of ints grouped in pairs, each made of the address of a function and the associated Nvidia function ID:

[Screenshot: IDA Pro, array of function address and ID pairs in nvapi.dll’s data section]

Then again, the IDs and addresses match the information we already have.

It means we are now sure of the locations of “GetPstates20” and “SetPstates20”, and we can break directly inside them at will. Let’s do that in IDA after importing the nvapi.h header so IDA knows about the structs in use: grab the pointer for the second argument on the stack just when entering “GetPstates20”, dereference it and apply the “NV_GPU_PERF_PSTATES20_INFO_V1” struct type to it.

Now all those apparently garbage values are starting to make sense.

We can confirm everything is as we expected by comparing one of the values to an authoritative measurement. Here 0xD8ACC stands for the GPU vCore expressed in µV. It is 887500 in base 10, meaning 887.5 mV or 0.8875 V. The GPU-Z tool reports a similar value.

It seems we’re doing fine:

[Screenshot: the decoded voltage value compared with the GPU-Z reading]
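The unit conversion is trivial, but it is easy to slip a factor of 1000 when eyeballing hex dumps, so here is the arithmetic spelled out. The two helper names are mine and only exist for this illustration.

```c
/* Convert a raw microvolt reading (as stored in the pstates struct)
 * to millivolts and volts for comparison with a monitoring tool. */
double microvolts_to_millivolts(unsigned long uv)
{
    return (double)uv / 1000.0;
}

double microvolts_to_volts(unsigned long uv)
{
    return (double)uv / 1000000.0;
}
```

With the value seen in the debugger, `microvolts_to_millivolts(0xD8ACC)` gives 887.5, since 0xD8ACC is 887500 in decimal.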

For good measure, let’s go back a bit and put a conditional logging breakpoint in Olly at the beginning of the “QueryInterface” function in order to log ALL the function IDs successively requested by MSI Afterburner, just in case things don’t go smoothly and a complicated setup sequence turns out to be needed before we can overclock the GPU.

_NvAPI_Initialize            = 0150E828
718006A0   COND: offset            = 33C7358C
718006A0   COND: offset            = 593E8644
_NvAPI_SYS_GetDriverAndBranchVersion    = 2926AAAD
_NvAPI_EnumPhysicalGPUs            = E5AC921F
718006A0   COND: offset            = 6533EA3E
_NvAPI_GPU_GetBusId            = 1BE0B8E5
_NvAPI_GPU_GetBusSlotId            = 2A0A350F
_NvAPI_GPU_GetPCIIdentifiers        = 2DDFB66E
_NvAPI_DRS_EnumAvailableSettingIds    = F020614A
_NvAPI_DRS_CreateSession        = 0694D52E
_NvAPI_DRS_LoadSettings            = 375DBD6B
_NvAPI_DRS_GetBaseProfile        = DA8466A0
_NvAPI_DRS_GetSetting            = 73BF8338
_NvAPI_DRS_DestroySession        = DAD9CFF8
_NvAPI_GPU_ClientPowerTopologyGetStatus = EDCF624E
_NvAPI_GPU_GetThermalSettings        = E3640A56
_NvAPI_GPU_GetDynamicPstatesInfoEx    = 60DED2ED
_NvAPI_GPU_GetCoolerSettings        = DA141340
_NvAPI_GPU_GetAllClockFrequencies    = DCB616C3
_NvAPI_GPU_GetPstates20            = 6FF81213
718006A0   COND: offset            = 07F9B368
718006A0   COND: offset            = 409D9841
_NvAPI_GPU_ClientPowerPoliciesGetInfo    = 34206D86
718006A0   COND: offset            = 0D258BB5
_NvAPI_GPU_GetSystemType        = BAAABFCC
_NvAPI_GPU_GetFullName            = CEEE8E9F
718006A0   COND: offset            = 3D358A0C
718006A0   COND: offset            = D988F0F3
_NvAPI_GPU_GetVbiosVersionString    = A561FD7D
_NvAPI_GPU_SetPstates20            = 0F4DAE6B
_NvAPI_Unload                = D22BDD7E

Despite the stackoverflow post being somewhat outdated or incomplete we can still name the majority of the functions called.
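Annotating such a log by hand gets old fast. A tiny lookup like the one below does the job; it is only a sketch, `name_for_id` is a hypothetical helper, and the table holds just a few entries copied from the listing above rather than the full set.

```c
#include <stddef.h>

struct known_id {
    unsigned long id;
    const char *name;
};

/* A handful of entries taken from the listing; the real table is longer. */
static const struct known_id known_ids[] = {
    { 0x0150E828UL, "NvAPI_Initialize" },
    { 0xE5AC921FUL, "NvAPI_EnumPhysicalGPUs" },
    { 0x6FF81213UL, "NvAPI_GPU_GetPstates20" },
    { 0x0F4DAE6BUL, "NvAPI_GPU_SetPstates20" },
    { 0xD22BDD7EUL, "NvAPI_Unload" },
};

/* Return the name for a logged ID, or NULL when the listing does not
 * cover it (those show up as raw "COND: offset" lines in the log). */
const char *name_for_id(unsigned long id)
{
    size_t count = sizeof known_ids / sizeof known_ids[0];
    size_t i;

    for (i = 0; i < count; i++) {
        if (known_ids[i].id == id)
            return known_ids[i].name;
    }
    return NULL;
}
```

IDs such as 0x33C7358C come back as NULL, which matches the unnamed lines in the log.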

Reading through the log quickly shows the obvious things one would expect: init NvAPI, get various pieces of information, finally call SetPstates20 (when we hit the breakpoint while attempting some small underclock) and clean up the API.

Considering “GetVbiosVersionString” is called just before the overclocking function, and that it’s a purely GUI/dashboard info feature, we can safely assume that no particular setup is required. We just need the correct arguments to call said function.

Reversing the function’s arguments

This one was supposed to be hell, but it actually went better than previously thought:

– Most NvAPI functions including GetPstates20 take a physical GPU handle as their first argument.

– GetPstates20’s second argument is a struct for storing Pstates and it is documented in the public NvAPI headers.

– A quick look at the code calling “SetPstates20” in MSI Afterburner shows 2 arguments pushed before the call. The first one is “0x100” for both the GET and SET functions; it is the handle for our GPU #0.

It is then highly likely that “SetPstates20” will take the same kind of struct as “GetPstates20” for its second argument, with a few edited values.

Let’s get coding

First, let’s isolate the data structures we need from the NvAPI headers, because there are far too many lines in that file and I’m lazy to the point that scrolling hurts my finger. Also, those are mostly ints, so we’ll get rid of all the fancy type names and make them regular uint/int for readability.

typedef unsigned long NvU32;

typedef struct {
    NvU32   version;
    NvU32   ClockType:2;
    NvU32   reserved:22;
    NvU32   reserved1:8;
    struct {
        NvU32   bIsPresent:1;
        NvU32   reserved:31;
        NvU32   frequency;
    }domain[32];
} NV_GPU_CLOCK_FREQUENCIES_V2;

typedef struct {
    int value;
    struct {
        int   mindelta;
        int   maxdelta;
    } valueRange;
} NV_GPU_PERF_PSTATES20_PARAM_DELTA;

typedef struct {
    NvU32   domainId;
    NvU32   typeId;
    NvU32   bIsEditable:1;
    NvU32   reserved:31;
    NV_GPU_PERF_PSTATES20_PARAM_DELTA   freqDelta_kHz;
    union {
        struct {
            NvU32   freq_kHz;
        } single;
        struct {
            NvU32   minFreq_kHz;
            NvU32   maxFreq_kHz;
            NvU32   domainId;
            NvU32   minVoltage_uV;
            NvU32   maxVoltage_uV;
        } range;
    } data;
} NV_GPU_PSTATE20_CLOCK_ENTRY_V1;

typedef struct {
    NvU32   domainId;
    NvU32   bIsEditable:1;
    NvU32   reserved:31;
    NvU32   volt_uV;
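    NV_GPU_PERF_PSTATES20_PARAM_DELTA   voltDelta_uV;   /* full delta struct: a plain int would make the root struct smaller than 0x1c94 */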
} NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1;

typedef struct {
    NvU32   version;
    NvU32   bIsEditable:1;
    NvU32   reserved:31;
    NvU32   numPstates;
    NvU32   numClocks;
    NvU32   numBaseVoltages;
    struct {
        NvU32                                   pstateId;
        NvU32                                   bIsEditable:1;
        NvU32                                   reserved:31;
        NV_GPU_PSTATE20_CLOCK_ENTRY_V1          clocks[8];
        NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1   baseVoltages[4];
    } pstates[16];
} NV_GPU_PERF_PSTATES20_INFO_V1;

 

Then we need prototypes for the functions we’ll use. Remember we won’t call the provided exports from the NvAPI lib inside the SDK but rather retrieve the function pointers directly from the running nvapi.dll and execute them as such.

Here are a handful of convenient function prototypes to get some info, retrieve the clocks and set them. Some of them can be found in the public API, the others are probably from the NDA version. We use the same techniques as mentioned earlier to learn about them:

typedef void *(*NvAPI_QueryInterface_t)(unsigned int offset);
typedef int (*NvAPI_Initialize_t)();
typedef int (*NvAPI_Unload_t)();
typedef int (*NvAPI_EnumPhysicalGPUs_t)(int **handles, int *count);
typedef int (*NvAPI_GPU_GetSystemType_t)(int *handle, int *systype);
typedef int (*NvAPI_GPU_GetFullName_t)(int *handle, char *sysname);
typedef int (*NvAPI_GPU_GetPhysicalFrameBufferSize_t)(int *handle, int *memsize);
typedef int (*NvAPI_GPU_GetRamType_t)(int *handle, int *memtype);
typedef int (*NvAPI_GPU_GetVbiosVersionString_t)(int *handle, char *biosname);
typedef int (*NvAPI_GPU_GetAllClockFrequencies_t)(int *handle, NV_GPU_CLOCK_FREQUENCIES_V2 *clock_freqs);
typedef int (*NvAPI_GPU_GetPstates20_t)(int *handle, NV_GPU_PERF_PSTATES20_INFO_V1 *pstates_info);
typedef int (*NvAPI_GPU_SetPstates20_t)(int *handle, int *pstates_info);

NvAPI_QueryInterface_t NvQueryInterface = 0;
NvAPI_Initialize_t NvInit = 0;
NvAPI_Unload_t NvUnload = 0;
NvAPI_EnumPhysicalGPUs_t NvEnumGPUs = 0;
NvAPI_GPU_GetSystemType_t NvGetSysType = 0;
NvAPI_GPU_GetFullName_t NvGetName = 0;
NvAPI_GPU_GetPhysicalFrameBufferSize_t NvGetMemSize = 0;
NvAPI_GPU_GetRamType_t NvGetMemType = 0;
NvAPI_GPU_GetVbiosVersionString_t NvGetBiosName = 0;
NvAPI_GPU_GetAllClockFrequencies_t NvGetFreq = 0;
NvAPI_GPU_GetPstates20_t NvGetPstates = 0;
NvAPI_GPU_SetPstates20_t NvSetPstates = 0;

 

Time for the main() function, the code should be fairly short and this is just a PoC or whatever, brace yourself for screaming KNF nazis.

We’ll need those variables, they are of lesser importance, just the last line is a requirement.

“NV_GPU_PERF_PSTATES20_INFO_V1” is the root struct holding all the clocking and power data for the selected GPU handle. The size of this struct is 0x1c94 bytes. Nvidia packs that size together with a version number into the “version” field: the size sits in the low 16 bits and the version (1 here) is shifted left by 16, which gives 0x11c94. Set the field to anything else and the subsequent calls using the structure will return a cryptic error code.

int nGPU=0, userfreq = 0, systype=0, memsize=0, memtype=0;
int *hdlGPU[64]={0}, *buf=0;
char sysname[64]={0}, biosname[64]={0};
NV_GPU_PERF_PSTATES20_INFO_V1 pstates_info;
pstates_info.version = 0x11c94;

Now we actually load “nvapi.dll” into our program’s memory space and retrieve the “nvapi_QueryInterface” export that will provide us with the pointers for all the other functions. We then call it successively with all the IDs we need and assign the results to our function pointers.

NvQueryInterface = (void*)GetProcAddress(LoadLibrary("nvapi.dll"), "nvapi_QueryInterface");
NvInit          = NvQueryInterface(0x0150E828);
NvUnload        = NvQueryInterface(0xD22BDD7E);
NvEnumGPUs      = NvQueryInterface(0xE5AC921F);
NvGetSysType    = NvQueryInterface(0xBAAABFCC);
NvGetName       = NvQueryInterface(0xCEEE8E9F);
NvGetMemSize    = NvQueryInterface(0x46FBEB03);
NvGetMemType    = NvQueryInterface(0x57F7CAAC);
NvGetBiosName   = NvQueryInterface(0xA561FD7D);
NvGetFreq       = NvQueryInterface(0xDCB616C3);
NvGetPstates    = NvQueryInterface(0x6FF81213);
NvSetPstates    = NvQueryInterface(0x0F4DAE6B);

We have all the required bits for our big plot so let’s just assemble the bricks together to get the information and data we need and display them in an ugly fashion.

NvInit();
NvEnumGPUs(hdlGPU, &nGPU);
NvGetSysType(hdlGPU[0], &systype);
NvGetName(hdlGPU[0], sysname);
NvGetMemSize(hdlGPU[0], &memsize);
NvGetMemType(hdlGPU[0], &memtype);
NvGetBiosName(hdlGPU[0], biosname);
NvGetPstates(hdlGPU[0], &pstates_info);

    switch(systype){
        case 1:     printf("\nType: Laptop\n"); break;
        case 2:     printf("\nType: Desktop\n"); break;
        default:    printf("\nType: Unknown\n"); break;
    }
    printf("Name: %s\n", sysname);
    printf("VRAM: %dMB GDDR%d\n", memsize/1024, memtype<=7?3:5);
    printf("BIOS: %s\n", biosname);
    printf("\nGPU: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).data.range.maxFreq_kHz)/1000);
    printf("RAM: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).data.single.freq_kHz)/1000);
    printf("\nCurrent GPU OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).freqDelta_kHz.value)/1000);
    printf("Current RAM OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).freqDelta_kHz.value)/1000);

It should already be enough for a simple monitoring program like the well known GPUz but we can do better than that and get to the delicious megahertz… /omnomnomz.

Here’s for the GPU overclocking:

if(argc > 1){
    userfreq = atoi(argv[1])*1000;
    if(-250000 <= userfreq && userfreq <= 250000) {
        buf = malloc(0x1c94);
        memset(buf, 0, 0x1c94);
        buf[0] = 0x11c94; buf[2] = 1; buf[3] = 1;
        buf[10] = userfreq;
        NvSetPstates(hdlGPU[0], buf)? printf("\nGPU OC failed!\n") : printf("\nGPU OC OK: %d MHz\n", userfreq/1000);
        free(buf);
    } else {
        printf("\nGPU Frequency not in safe range (-250MHz to +250MHz).\n");
        return 1;
    }
}

And almost the same block of code for the VRAM overclocking:

if(argc > 2){
    userfreq = atoi(argv[2])*1000;
    if(-250000 <= userfreq && userfreq <= 250000) {
        buf = malloc(0x1c94);
        memset(buf, 0, 0x1c94);
        buf[0] = 0x11c94; buf[2] = 1; buf[3] = 1;
        buf[7] = 4; buf[10] = memtype<=7?userfreq:userfreq*2;
        NvSetPstates(hdlGPU[0], buf)? printf("VRAM OC failed!\n") : printf("RAM OC OK: %d MHz\n", userfreq/1000);
        free(buf);
    } else {
        printf("\nRAM Frequency not in safe range (-250MHz to +250MHz).\n");
        return 1;
    }
}

What it does is simple. Frequencies in the struct are expressed in kilohertz, so we multiply the MHz offset provided by the user by 1000 and check it against a ±250MHz window, trying not to let an integer overflow slip through and fry the core. ;-)

No seriously, we “allocate” some “NV_GPU_PERF_PSTATES20_INFO_V1” yet again, but it’s a bloated pain and we only want a few values changed, so we make it an empty buffer the same size as the struct.

We then fill the first int with the magic 0x11c94 version number, and the ints at index 2 and 3 with 1. Those offsets line up with numPstates and numClocks in the struct above, so we are telling SetPstates20() that we provide only one Pstate profile containing only one clock domain.

But if we do that… it works for the GPU but not for the VRAM, how do we overclock the damn VRAM?

At this point I was saying to myself “why can those guys get an overclock in their crappy soft and I cannot, that’s just unfair”. But this approach never gives any noteworthy result so I got back in IDA and diffed my struct with the struct that MSI Afterburner provides to the same call when overclocking the RAM.

And the trick was there before my eyes: the int at index 7 of the struct had changed from 0 to 4. That offset lines up with the domainId field of the first clock entry (just as index 10 lines up with its freqDelta_kHz.value), so it selects the clock domain: 0 is the GPU, 2 may or may not be the separate shader clock domain of previous GPU generations, and 4 is the VRAM.

Finally we unload the DLL and end our program:

NvUnload();
return 0;

What’s left to be done?

Testing our new toy obviously! We compile that thing and run it:

C:\>overclock.exe    [+/- GPU MHz offset]    [+/- RAM MHz offset]

[Screenshot: overclock]

Here we run two benchmarks using a basic MD5 bruteforcing so that we can be sure the modified clock speed is effective and we didn’t just change some funny numbers for display only.

First at stock frequency (950MHz) and then with a 100MHz underclock (the dev machine is a laptop, I don’t intend to make it faster so subtracting 100 will suffice).

[Screenshot: benchmark at stock frequency]

[Screenshot: benchmark with the 100MHz underclock]

And… it works! We’re done here guys.

I am not aware of any other open source implementation of such a tool. It might only be a very simple C program in the end, but it exists, and the minimum required details for overclocking an Nvidia GPU programmatically are now public and in plain text.

This code, for what it’s worth, is free as in free beer: take it, polish it, make a LIGHT (not the usual stellar poop, we already have those) GUI for your own needs and enjoy.

 

Here’s a binary build of the program, rename it as .exe as wordpress.com will only let users upload media files:

https://1vwjbxf1wko0yhnr.files.wordpress.com/2015/08/overclock.jpg

Full code for your convenience, it should compile without warnings on pretty much everything and has no dependencies:

 

/*
            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
                    Version 2, December 2004

 Copyright (C) 2004 Sam Hocevar <sam@hocevar.net>

 Everyone is permitted to copy and distribute verbatim or modified
 copies of this license document, and changing it is allowed as long
 as the name is changed.

            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

  0. You just DO WHAT THE FUCK YOU WANT TO.
*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>     /* memset */
#include <windows.h>

typedef unsigned long NvU32;

typedef struct {
    NvU32   version;
    NvU32   ClockType:2;
    NvU32   reserved:22;
    NvU32   reserved1:8;
    struct {
        NvU32   bIsPresent:1;
        NvU32   reserved:31;
        NvU32   frequency;
    }domain[32];
} NV_GPU_CLOCK_FREQUENCIES_V2;

typedef struct {
    int value;
    struct {
        int   mindelta;
        int   maxdelta;
    } valueRange;
} NV_GPU_PERF_PSTATES20_PARAM_DELTA;

typedef struct {
    NvU32   domainId;
    NvU32   typeId;
    NvU32   bIsEditable:1;
    NvU32   reserved:31;
    NV_GPU_PERF_PSTATES20_PARAM_DELTA   freqDelta_kHz;
    union {
        struct {
            NvU32   freq_kHz;
        } single;
        struct {
            NvU32   minFreq_kHz;
            NvU32   maxFreq_kHz;
            NvU32   domainId;
            NvU32   minVoltage_uV;
            NvU32   maxVoltage_uV;
        } range;
    } data;
} NV_GPU_PSTATE20_CLOCK_ENTRY_V1;

typedef struct {
    NvU32   domainId;
    NvU32   bIsEditable:1;
    NvU32   reserved:31;
    NvU32   volt_uV;
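    NV_GPU_PERF_PSTATES20_PARAM_DELTA   voltDelta_uV;   /* full delta struct: a plain int would make the root struct smaller than 0x1c94 */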
} NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1;

typedef struct {
    NvU32   version;
    NvU32   bIsEditable:1;
    NvU32   reserved:31;
    NvU32   numPstates;
    NvU32   numClocks;
    NvU32   numBaseVoltages;
    struct {
        NvU32                                   pstateId;
        NvU32                                   bIsEditable:1;
        NvU32                                   reserved:31;
        NV_GPU_PSTATE20_CLOCK_ENTRY_V1          clocks[8];
        NV_GPU_PSTATE20_BASE_VOLTAGE_ENTRY_V1   baseVoltages[4];
    } pstates[16];
} NV_GPU_PERF_PSTATES20_INFO_V1;

typedef void *(*NvAPI_QueryInterface_t)(unsigned int offset);
typedef int (*NvAPI_Initialize_t)();
typedef int (*NvAPI_Unload_t)();
typedef int (*NvAPI_EnumPhysicalGPUs_t)(int **handles, int *count);
typedef int (*NvAPI_GPU_GetSystemType_t)(int *handle, int *systype);
typedef int (*NvAPI_GPU_GetFullName_t)(int *handle, char *sysname);
typedef int (*NvAPI_GPU_GetPhysicalFrameBufferSize_t)(int *handle, int *memsize);
typedef int (*NvAPI_GPU_GetRamType_t)(int *handle, int *memtype);
typedef int (*NvAPI_GPU_GetVbiosVersionString_t)(int *handle, char *biosname);
typedef int (*NvAPI_GPU_GetAllClockFrequencies_t)(int *handle, NV_GPU_CLOCK_FREQUENCIES_V2 *clock_freqs);
typedef int (*NvAPI_GPU_GetPstates20_t)(int *handle, NV_GPU_PERF_PSTATES20_INFO_V1 *pstates_info);
typedef int (*NvAPI_GPU_SetPstates20_t)(int *handle, int *pstates_info);

NvAPI_QueryInterface_t NvQueryInterface = 0;
NvAPI_Initialize_t NvInit = 0;
NvAPI_Unload_t NvUnload = 0;
NvAPI_EnumPhysicalGPUs_t NvEnumGPUs = 0;
NvAPI_GPU_GetSystemType_t NvGetSysType = 0;
NvAPI_GPU_GetFullName_t NvGetName = 0;
NvAPI_GPU_GetPhysicalFrameBufferSize_t NvGetMemSize = 0;
NvAPI_GPU_GetRamType_t NvGetMemType = 0;
NvAPI_GPU_GetVbiosVersionString_t NvGetBiosName = 0;
NvAPI_GPU_GetAllClockFrequencies_t NvGetFreq = 0;
NvAPI_GPU_GetPstates20_t NvGetPstates = 0;
NvAPI_GPU_SetPstates20_t NvSetPstates = 0;

int main(int argc, char **argv)
{
    int nGPU=0, userfreq = 0, systype=0, memsize=0, memtype=0;
    int *hdlGPU[64]={0}, *buf=0;
    char sysname[64]={0}, biosname[64]={0};
    NV_GPU_PERF_PSTATES20_INFO_V1 pstates_info;
    pstates_info.version = 0x11c94;

    NvQueryInterface = (void*)GetProcAddress(LoadLibrary("nvapi.dll"), "nvapi_QueryInterface");
    NvInit          = NvQueryInterface(0x0150E828);
    NvUnload        = NvQueryInterface(0xD22BDD7E);
    NvEnumGPUs      = NvQueryInterface(0xE5AC921F);
    NvGetSysType    = NvQueryInterface(0xBAAABFCC);
    NvGetName       = NvQueryInterface(0xCEEE8E9F);
    NvGetMemSize    = NvQueryInterface(0x46FBEB03);
    NvGetMemType    = NvQueryInterface(0x57F7CAAC);
    NvGetBiosName   = NvQueryInterface(0xA561FD7D);
    NvGetFreq       = NvQueryInterface(0xDCB616C3);
    NvGetPstates    = NvQueryInterface(0x6FF81213);
    NvSetPstates    = NvQueryInterface(0x0F4DAE6B);

    NvInit();
    NvEnumGPUs(hdlGPU, &nGPU);
    NvGetSysType(hdlGPU[0], &systype);
    NvGetName(hdlGPU[0], sysname);
    NvGetMemSize(hdlGPU[0], &memsize);
    NvGetMemType(hdlGPU[0], &memtype);
    NvGetBiosName(hdlGPU[0], biosname);
    NvGetPstates(hdlGPU[0], &pstates_info);

    switch(systype){
        case 1:     printf("\nType: Laptop\n"); break;
        case 2:     printf("\nType: Desktop\n"); break;
        default:    printf("\nType: Unknown\n"); break;
    }
    printf("Name: %s\n", sysname);
    printf("VRAM: %dMB GDDR%d\n", memsize/1024, memtype<=7?3:5);
    printf("BIOS: %s\n", biosname);
    printf("\nGPU: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).data.range.maxFreq_kHz)/1000);
    printf("RAM: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).data.single.freq_kHz)/1000);
    printf("\nCurrent GPU OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[0]).freqDelta_kHz.value)/1000);
    printf("Current RAM OC: %dMHz\n", (int)((pstates_info.pstates[0].clocks[1]).freqDelta_kHz.value)/1000);

    if(argc > 1){
        userfreq = atoi(argv[1])*1000;
        if(-250000 <= userfreq && userfreq <= 250000) {
            buf = malloc(0x1c94);
            memset(buf, 0, 0x1c94);
            buf[0] = 0x11c94; buf[2] = 1; buf[3] = 1;
            buf[10] = userfreq;
            NvSetPstates(hdlGPU[0], buf)? printf("\nGPU OC failed!\n") : printf("\nGPU OC OK: %d MHz\n", userfreq/1000);
            free(buf);
        } else {
            printf("\nGPU Frequency not in safe range (-250MHz to +250MHz).\n");
            return 1;
        }
    }
    if(argc > 2){
        userfreq = atoi(argv[2])*1000;
        if(-250000 <= userfreq && userfreq <= 250000) {
            buf = malloc(0x1c94);
            memset(buf, 0, 0x1c94);
            buf[0] = 0x11c94; buf[2] = 1; buf[3] = 1;
            buf[7] = 4; buf[10] = memtype<=7?userfreq:userfreq*2;
            NvSetPstates(hdlGPU[0], buf)? printf("VRAM OC failed!\n") : printf("RAM OC OK: %d MHz\n", userfreq/1000);
            free(buf);
        } else {
            printf("\nRAM Frequency not in safe range (-250MHz to +250MHz).\n");
            return 1;
        }
    }
    NvUnload();
    return 0;
}

 

Now I have a stack of games to play with mind blowing framerate, hence this post is over.
