LeToR using linear regression

  • The purpose of this project is to use machine learning to solve the Learning to Rank (LeToR) problem.
  • We use two linear regression techniques to solve this ranking problem. The tasks given for this project are:


Learning to Rank (LeToR):

In this project, we use the ‘QueryLevelNorm’ version of the LeToR 4.0 package provided by Microsoft. This dataset contains 69623 vectors of 46 dimensions each. Each dimension corresponds to a feature, such as the IDF of terms in the body, anchor, or title of a document.
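
To make the shape of the data concrete, here is a minimal parsing sketch. It assumes each row uses the usual LeToR text layout: a relevance label, a `qid:<id>` token, then `index:value` pairs, followed by an optional `#` comment. The function name `parse_letor_line` is hypothetical and is not part of the package.

```python
def parse_letor_line(line, num_features=46):
    """Parse one 'label qid:<id> 1:<v> ... 46:<v> #comment' row (assumed layout)."""
    body = line.split("#", 1)[0].split()      # drop the trailing comment, if any
    label = float(body[0])                     # relevance label (regression target)
    qid = body[1].split(":", 1)[1]             # query identifier, kept as a string
    features = [0.0] * num_features
    for token in body[2:]:
        index, value = token.split(":", 1)
        features[int(index) - 1] = float(value)   # feature indices are 1-based
    return label, qid, features


# Hypothetical row with only three of the 46 features filled in.
label, qid, features = parse_letor_line("2 qid:10 1:0.5 2:0.25 46:1.0 #docid = X")
print(label, qid, len(features))   # prints: 2.0 10 46
```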

Data Partition:

80% of the total data was used as the training set, 10% as the validation set, and the remaining 10% as the test set for measuring the accuracy of the model.
55699 vectors were used for training.
6962 vectors were used for validation.
6962 vectors were used for testing.
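
The partition can be sketched as below. This is an illustration only: it assumes a contiguous split in file order with no shuffling, which the project description does not specify, and the helper name `split_80_10_10` is hypothetical.

```python
def split_80_10_10(rows):
    """Split rows into contiguous training, validation, and test parts."""
    n = len(rows)
    n_val = n // 10                   # 10% for validation (rounded down)
    n_test = n // 10                  # 10% for testing (rounded down)
    n_train = n - n_val - n_test      # the remainder, roughly 80%
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]
    return train, val, test


train, val, test = split_80_10_10(list(range(100)))
print(len(train), len(val), len(test))   # prints: 80 10 10
```

Giving the rounding remainder to the training set keeps the validation and test sets the same size.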

Experimentation:

We varied the following hyperparameters to observe how the accuracy of the model changes with their values:

  1. Number of Basis Functions (M)
  2. Regularization Factor (λ)
  3. Learning Rate (η)
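
The sketch below shows where each of the three hyperparameters could enter a linear regression model. It rests on assumptions the text above does not confirm: Gaussian radial basis functions, a closed-form regularized least-squares solution, and stochastic gradient descent (SGD) as the two techniques. All function names, the `spread` parameter, and the synthetic data are hypothetical.

```python
import numpy as np


def design_matrix(X, centers, spread):
    """Gaussian basis functions (assumed): exp(-||x - mu_j||^2 / (2 * spread^2))."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * spread ** 2))


def fit_closed_form(Phi, t, lam):
    """Regularized least squares: solve (lam * I + Phi^T Phi) w = Phi^T t."""
    M = Phi.shape[1]
    return np.linalg.solve(lam * np.eye(M) + Phi.T @ Phi, Phi.T @ t)


def fit_sgd(Phi, t, lam, eta, epochs=50, seed=0):
    """SGD on the same objective, one sample per update, learning rate eta."""
    rng = np.random.default_rng(seed)
    n, M = Phi.shape
    w = np.zeros(M)
    for _ in range(epochs):
        for i in rng.permutation(n):
            error = Phi[i] @ w - t[i]
            # Per-sample gradient; the penalty is spread evenly over n samples.
            w -= eta * (error * Phi[i] + (lam / n) * w)
    return w


def rms_error(Phi, t, w):
    """Root-mean-square error of the predictions Phi @ w."""
    return float(np.sqrt(np.mean((Phi @ w - t) ** 2)))


# Synthetic stand-in data: 200 vectors of 46 features with labels in {0, 1, 2}.
rng = np.random.default_rng(0)
X = rng.random((200, 46))
t = rng.integers(0, 3, size=200).astype(float)

M = 10                                                    # number of basis functions
centers = X[rng.choice(len(X), size=M, replace=False)]    # centers picked at random
Phi = design_matrix(X, centers, spread=2.0)

w_cf = fit_closed_form(Phi, t, lam=0.1)                   # lam: regularization factor
w_sgd = fit_sgd(Phi, t, lam=0.1, eta=0.01)                # eta: learning rate
print(rms_error(Phi, t, w_cf), rms_error(Phi, t, w_sgd))
```

In an experiment of this kind, one hyperparameter is varied while the others are held fixed, and the error on the validation set is compared across runs.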

Find the results of the experiments here: