We developed a machine learning model to evaluate the risk of relapse in lung cancer patients.

To build this model, we used mutational data from lung cancer patients, as described in “Evaluate the probability of relapse in patients with lung cancer”. We calculated the expression values as log2(expression + 1).

Patient’s data:

Pathological nodes:
Pathological tumour:
RAD51 expression:
ADGRF5 expression:
COCH expression:
SLC2A1 expression:
CLU expression:
ZDHHC7 expression:
LRFN4 expression:
AP2A2 expression:
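
The log2(expression + 1) transformation described above can be sketched for the eight genes in the list as follows. This is a minimal illustration, not the implementation behind the model: the function name `log2_transform`, the dictionary input format, and the rejection of negative values are all assumptions made for the example.

```python
import math

# Gene symbols taken from the input list above.
GENES = ["RAD51", "ADGRF5", "COCH", "SLC2A1", "CLU", "ZDHHC7", "LRFN4", "AP2A2"]


def log2_transform(raw):
    """Return log2(value + 1) for each gene in `raw`.

    `raw` is a hypothetical dict mapping gene symbol -> untransformed
    expression value. Adding 1 keeps a value of 0 defined (it maps to 0).
    """
    missing = [gene for gene in GENES if gene not in raw]
    if missing:
        raise ValueError(f"missing expression values for: {missing}")

    transformed = {}
    for gene in GENES:
        value = raw[gene]
        # Assumption: negative raw expression is treated as invalid input.
        if value < 0:
            raise ValueError(f"negative expression value for {gene}: {value}")
        transformed[gene] = math.log2(value + 1)
    return transformed


if __name__ == "__main__":
    # Made-up values, for illustration only.
    example = {gene: 3.0 for gene in GENES}
    print(log2_transform(example))
```

Adding 1 before taking the logarithm means a gene with no measured expression contributes 0 instead of an undefined value.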

Next, we assigned risk levels using quartiles of the probability of recurrence in each month. Specifically, values below the first quartile represent low risk, values between the first quartile and the median represent medium risk, and values above the median represent high risk. In the figure, the grey line indicates the median.
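
The quartile rule above can be sketched for a single month as follows. This is an illustrative sketch under stated assumptions, not the code used for the model: the name `risk_level`, the choice of the inclusive quantile method, and the handling of values exactly equal to a threshold (assigned to medium risk) are not specified in the text.

```python
from statistics import quantiles


def risk_level(probability, population_probabilities):
    """Classify one patient's recurrence probability for a given month.

    `population_probabilities` holds the recurrence probabilities of the
    reference population for the same month (hypothetical input format).
    """
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, median, _q3 = quantiles(population_probabilities, n=4, method="inclusive")

    if probability < q1:
        return "low"
    # Assumption: values equal to Q1 or to the median count as medium risk.
    if probability <= median:
        return "medium"
    return "high"


if __name__ == "__main__":
    # Made-up population values, for illustration only.
    population = [0.1, 0.2, 0.3, 0.4, 0.5]
    for p in (0.1, 0.25, 0.45):
        print(p, risk_level(p, population))
```

Repeating this classification for every month gives the risk level along the whole of a patient's curve.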

The grey line also shows the percentage of patients with recurrence in a given month. If a patient’s curve remains below this line, that patient therefore has a lower risk than the overall population at each time point.

We obtained the expression values from Affymetrix HG-U133A arrays and normalized them using the RMA (Robust Multi-array Average) method.

No model can predict the future with certainty. For this reason, physicians must always interpret these predictions within the full clinical context before making a diagnosis.