TEXTKNOCKOFF: KNOCKOFF NETS FOR STEALING FUNCTIONALITY OF TEXT SENTIMENT MODELS
DOI:
https://doi.org/10.56651/lqdtu.jst.v13.n01.821.ictKeywords:
Text classification, text sentiment, black-box model stealing, knockoff modelAbstract
Most commercial machine learning models today are designed to require significant amounts of time, money, and human effort. Therefore, intrinsic information about the model (such as architecture, hyperparameters, and training data) needs to be kept confidential. These models are referred to as black boxes, and there is an increasing amount of research focused on both attacking and protecting them. Recent publications have often concentrated on the field of computer vision; in contrast, there is still relatively little research on methods for attacking black box models with textual data. This article introduces a research method for extracting the functionality of a black box model in the task of text sentiment analysis. The method has been effectively tested based on random sampling techniques to reconstruct a new model with equivalent functionality to the original model, achieving high accuracy (94.46% compared to 94.92%) and high similarity (96.82%).