Abstract:
This paper proposes a methodology for predicting repurchase intention from customer reviews and the customer’s stated intention. Reviews, however, contain a large number of words, and using all of them as features tends to reduce the accuracy of the prediction model and cause overfitting. A methodology based on a Genetic Algorithm is proposed to improve the feature selection iteratively. Each chromosome is encoded as a set of randomly selected indices of words in the vocabulary. The fitness of a chromosome is measured as the accuracy of a Decision Tree prediction model that uses the selected features (i.e., words). The Decision Tree model also provides feature importance values, which are used to rearrange the genes so that the Crossover procedure passes important genes to the offspring. For Mutation, information about the Tendency Rank of the features is used to alter a gene. The Crossover and Mutation procedures therefore do more than merely combine and modify chromosomes. The proposed methodology is applied to two data sets. For both data sets, its prediction accuracy is significantly higher than that of the baseline, i.e., random selection.
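The two guided operators described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how genes are split between parents or how the Tendency Rank is computed, so the half-and-half split, the function names, the `importance` mapping (word index to importance value, as a Decision Tree would supply), and the `tendency_rank` list (word indices ordered best first) are all assumptions made for illustration.

```python
import random


def importance_ordered_crossover(parent_a, parent_b, importance):
    """Build one child that inherits the most important genes first.

    Assumption: each parent is a list of unique word indices of equal length,
    and `importance` maps a word index to its feature importance value.
    The half-and-half split is a hypothetical choice, not from the source.
    """
    size = len(parent_a)
    # Rearrange the genes of each parent from most to least important.
    ranked_a = sorted(parent_a, key=lambda g: importance.get(g, 0.0), reverse=True)
    ranked_b = sorted(parent_b, key=lambda g: importance.get(g, 0.0), reverse=True)
    child = []
    # Top half of A first, then B in importance order, then the rest of A.
    for gene in ranked_a[: size // 2] + ranked_b + ranked_a[size // 2:]:
        if gene not in child:
            child.append(gene)
        if len(child) == size:
            break
    return child


def tendency_mutation(chromosome, tendency_rank, rng):
    """Replace one randomly chosen gene with the best-ranked unused word.

    Assumption: `tendency_rank` is a list of word indices ordered best first;
    how that ranking is derived is not specified in the abstract.
    """
    mutated = list(chromosome)
    present = set(mutated)
    for candidate in tendency_rank:
        if candidate not in present:
            mutated[rng.randrange(len(mutated))] = candidate
            break
    return mutated
```

In a full Genetic Algorithm loop, each child produced this way would be scored by training a Decision Tree on the selected words and using its accuracy as the fitness, as the abstract describes.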
Description:
Paper presented at the 2021 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), pp. 1-6.