journal article Open Access Feb 01, 2021

A Systematic Comparison of Search-Based Approaches for LDA Hyperparameter Tuning

DOI: 10.1016/j.infsof.2020.106411


Metrics
Citations: 55
References: 82
Details
Published: Feb 01, 2021
Vol/Issue: 130
Pages: 106411
Funding: Technische Universiteit Delft
Cite This Article
Annibale Panichella (2021). A Systematic Comparison of Search-Based Approaches for LDA Hyperparameter Tuning. Information and Software Technology, 130, 106411. https://doi.org/10.1016/j.infsof.2020.106411