The client receives anywhere from 50 to more than 100 leads every day from various channels. Sifting through them to find leads with potential is challenging: documents within an RFP/RFQ package often run to 25 pages or more. Because not all leads can be thoroughly evaluated, or even seen, in a timely manner, the client is missing opportunities.
To tackle this issue, we built a model based on keyword scores. We tried many supervised machine learning approaches, but they did not yield good results. Supervised learning does not work here because there is no objective definition of a target (label) on which to train a model: not all bid-on proposals (those in PRJ-xxx folders) will win, and some random proposals (collected from bidding sites) may also have a chance of winning. Any label would therefore be arbitrary. Instead, we used keyword matching and fuzzy keyword matching, two Natural Language Processing (NLP) techniques, for this use case.
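As an illustration of how exact and fuzzy keyword matching can be combined into a single score, here is a minimal sketch using Python's standard-library `difflib`. The keyword list, the weights, the 0.85 similarity threshold, and the function names are all hypothetical; the actual keywords and scoring rules used in the project are not described here.

```python
import re
from difflib import SequenceMatcher

# Hypothetical keywords and weights, for illustration only.
KEYWORD_WEIGHTS = {
    "installation": 2.0,
    "maintenance": 1.5,
    "hvac": 3.0,
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def best_similarity(keyword, tokens):
    """Return the highest similarity ratio between keyword and any token."""
    return max(
        (SequenceMatcher(None, keyword, token).ratio() for token in tokens),
        default=0.0,
    )


def keyword_score(text, weights=KEYWORD_WEIGHTS, fuzzy_threshold=0.85):
    """Sum the weights of keywords found exactly or approximately in text.

    fuzzy_threshold is an assumed cutoff, not a value from the project.
    """
    tokens = set(tokenize(text))
    score = 0.0
    for keyword, weight in weights.items():
        if keyword in tokens:
            score += weight  # exact keyword match
        elif best_similarity(keyword, tokens) >= fuzzy_threshold:
            score += weight  # fuzzy match, e.g. a misspelling or OCR error
    return score


print(keyword_score("HVAC instalation and maintenence"))
print(keyword_score("Catering services for annual gala"))
```

In this sketch each keyword counts at most once per document, and a fuzzy match is worth as much as an exact one. Either choice could reasonably differ in a real scoring scheme, for example by discounting fuzzy matches or by counting repeated occurrences.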
The keyword-score-based method effectively assigns higher scores to bid-on proposals and lower scores to random proposals. Among random proposals, 90.41% have a score below 5.
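A figure like the share of proposals scoring below a cutoff can be computed as in the following sketch. The function name is hypothetical and the scores are made up for illustration; they are not the project's data.

```python
def share_below(scores, threshold=5.0):
    """Return the percentage of scores strictly below threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    below = sum(1 for score in scores if score < threshold)
    return 100.0 * below / len(scores)


# Made-up scores for illustration only; these are not the project's results.
example_scores = [0.0, 1.5, 3.0, 4.5, 6.5]
print(f"{share_below(example_scores):.2f}% of example scores are below 5")
```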