The results of eliciting risk preferences depend heavily on the elicitation method used, which raises the question of how risk preferences can be elicited reliably. Using item response theory (IRT), the results of four elicitation methods, which describe common latent variables identified as risk preferences, are combined into a composite score. The responses of 9235 individuals to a dedicated survey indicate that the composite score estimates latent risk preferences more accurately than the results of the individual methods, substantially reducing measurement noise and method-specific biases. IRT improves accuracy by letting the weight given to each method depend on the range of latent risk preferences over which that method is most relevant. The composite score therefore contains more information about latent risk preferences than the results of either factor-weighted or unweighted methods. Manipulating the specific amounts, order and starting point of the multiple price list method shows that the accuracy of this method is susceptible to framing effects. Combining simpler methods with more advanced methods, all framed closely to the relevant situation, yields a more accurate and more robust estimate of latent risk preferences.
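The range-dependent weighting described above can be illustrated with a minimal sketch. It assumes a two-parameter logistic (2PL) IRT model, which the abstract does not specify; the method names and the discrimination and location parameters are hypothetical placeholders, not estimates from the survey. Under that assumption, each method's Fisher information peaks near its own location parameter, so its share of the total information changes with the latent level.

```python
import math

def response_probability(theta, a, b):
    """2PL probability of the higher response category at latent level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = response_probability(theta, a, b)
    return a * a * p * (1.0 - p)

# Hypothetical (discrimination a, location b) per elicitation method.
# These values are illustrative assumptions, not results from the study.
ITEMS = {
    "method_A": (1.8, -1.0),
    "method_B": (1.2, 0.0),
    "method_C": (0.8, 0.5),
    "method_D": (1.5, 1.5),
}

def information_shares(theta):
    """Share of the total information contributed by each method at theta."""
    info = {name: item_information(theta, a, b) for name, (a, b) in ITEMS.items()}
    total = sum(info.values())
    return {name: value / total for name, value in info.items()}

# Print how the shares shift across three latent levels.
for theta in (-1.0, 0.0, 1.5):
    shares = information_shares(theta)
    print(theta, {name: round(share, 2) for name, share in shares.items()})
```

In this sketch the method that dominates at a low latent level differs from the one that dominates at a high level, which is the sense in which an IRT-based composite weights each method by where it is most informative, in contrast to a single fixed weight per method.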