1. Who can participate in the contest?
The contest is open to everyone. However, participants who are not students are not eligible for prizes.
2. What programming languages can be used?
Any programming language of your choice can be used.
3. For the train data and the test data, will we be given only question text or question features as well?
You will be provided with both the raw question text and feature vectors for each question. You do not necessarily have to work on feature vectors provided by us. You can use any feature extraction algorithm of your choice on the question text provided to you.
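As an illustration of extracting your own features from the raw question text, here is a minimal bag-of-words sketch using only the Python standard library. The example questions and the function names (tokenize, build_vocabulary, vectorize) are hypothetical and are not part of the contest data or tooling; the feature vectors we provide may be built differently.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(questions):
    """Map each distinct token in the training questions to a column index."""
    vocab = {}
    for question in questions:
        for token in tokenize(question):
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def vectorize(question, vocab):
    """Return a term-count vector; tokens outside the vocabulary are ignored."""
    vector = [0] * len(vocab)
    for token, count in Counter(tokenize(question)).items():
        if token in vocab:
            vector[vocab[token]] = count
    return vector


# Hypothetical example questions, not taken from the contest data.
train_questions = ["What is machine learning?", "Who invented the telephone?"]
vocab = build_vocabulary(train_questions)
features = [vectorize(q, vocab) for q in train_questions]
```

The vocabulary is built from the training questions only, so test questions are mapped onto the same columns and unseen words are simply dropped.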
4. Can you suggest some links for learning machine learning? It would be very helpful.
Please visit the event website.
5. Do I need to come to IISc to participate in the contest?
No, this is an online event; you only need to register online.
6. Can a team have more than one member?
No, this is an individual event.
7. Can we use code that is available online?
Yes, you can use any publicly available resource. Please cite each one properly when you send us a report on your approach.
8. Can we use training data from an external source in addition to the one provided?
No, to ensure fairness, this is strictly disallowed and will lead to disqualification.
9. Can we use algorithms already published in papers?
Yes, you can use algorithms published in papers. Please cite them properly when you send us a report on your approach.
10. Can we use GUI-based tools like Weka, RapidMiner, etc.? If so, how can we submit the code?
Yes, you can use these tools. In that case you will not be able to submit code, so your report should describe the sequence of steps you took in the GUI to arrive at your result. Because some points are awarded for the code, you will lose those points. Therefore, even if you use Weka, try to call its libraries from code rather than through the GUI.
11. Can we use tools for part-of-speech tagging, sentiment analysis, etc.?
You can use any tool you want. However, these should be documented properly when you send us a report on your approach.
12. Can we use information other than the question text and the label while learning a model?
You can use external data sources in the training phase (while learning a model). For example, you can use a publicly available list of stop words, or an external data source to create features that might be useful for this task, but you must not use additional training examples. However, you must make predictions on the test data using only the learned model. You are not allowed to use any external means of making predictions on the validation or test set, such as crawling the web or manually labelling examples; doing so will lead to disqualification.
13. Are we allowed to use a publicly available list of stop words?
Yes, but document this in the report that you send us at the end.
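As a rough sketch of how a stop word list might be applied to question text, the snippet below filters tokens against a small set. The STOP_WORDS set and the example question are hypothetical and are used only for illustration; in practice you would load a publicly available list and cite it in your report.

```python
import re

# Hypothetical, tiny stop word list for illustration only.
STOP_WORDS = {"a", "an", "the", "is", "of", "in", "to"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Tokenize the text in lowercase and drop tokens found in stop_words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [token for token in tokens if token not in stop_words]


# Hypothetical example question, not taken from the contest data.
filtered = remove_stop_words("What is the capital of India?")
```

Note that question words such as "what" or "who" appear in many general-purpose stop word lists but may be informative for classifying questions, so it can be worth checking a list before applying it.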
14. What do the scores in the leaderboard tell?
The leaderboard contains the best validation set accuracy (fraction of validation questions classified correctly) attained by each participant (sorted in descending order). This does not include test set accuracy and is only for your own evaluation. The final evaluation will be done on a test set.
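To make the accuracy definition concrete, here is a minimal sketch that computes the fraction of questions classified correctly. The label names and example predictions are hypothetical, and this is not the organisers' evaluation script.

```python
def accuracy(predicted, actual):
    """Return the fraction of positions where predicted equals actual."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels for four validation questions.
predicted = ["LOC", "NUM", "HUM", "LOC"]
actual = ["LOC", "NUM", "LOC", "LOC"]
score = accuracy(predicted, actual)  # 3 of 4 correct
```

You can run the same calculation on a held-out part of the training data to estimate how a model might score before submitting.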