Response Speed Analysis of Interactive Voicebot Receptionist
ABSTRACT
The Gambung Tea and Cinchona Research Center (PPTK Gambung) faces challenges in providing timely information to visitors. To address this, an interactive voicebot based on Python Speech Recognition and pyttsx3 was developed, combining speech-to-text and text-to-speech methods. Tests covered variations in internet conditions, noise intensity, and speaker accent, together with an analysis of the voice output. The results show accuracy of up to 90% with an average response time of 1.94 seconds on a stable internet connection with clear speech. In noisy environments with a high voice level (105 dB), the voicebot was still able to respond. The voicebot also achieved the same accuracy (90%) for native and non-native English speakers. This solution has the potential to improve information accessibility at PPTK Gambung, although its performance depends on internet stability and environmental conditions.
Keywords: voicebot, voice-to-text, text-to-speech, pyttsx3, speech recognition
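The two headline metrics in the abstract, recognition accuracy and average response time, can be illustrated with a small timing harness. The following is only a minimal sketch and not the authors' test procedure: the function names `run_trial` and `summarize`, the stand-in components, and the sample phrases are all hypothetical. In a real setup, the stand-in recognizer and responder would be replaced by calls into a speech recognition library and a text-to-speech engine such as pyttsx3.

```python
import time
from statistics import mean


def run_trial(recognize, respond, audio, expected_text):
    """Time one request, from audio input to the end of the spoken reply."""
    start = time.perf_counter()
    text = recognize(audio)   # speech-to-text step
    respond(text)             # text-to-speech step
    elapsed = time.perf_counter() - start
    correct = text.strip().lower() == expected_text.strip().lower()
    return {"correct": correct, "seconds": elapsed}


def summarize(trials):
    """Return the share of correct transcriptions and the mean response time."""
    return {
        "accuracy": sum(t["correct"] for t in trials) / len(trials),
        "mean_seconds": mean(t["seconds"] for t in trials),
    }


# Hypothetical stand-ins so the sketch runs without a microphone or speakers.
def fake_recognize(audio):
    return audio["heard"]


def fake_respond(text):
    pass


# Hypothetical sample phrases: (simulated audio, expected transcription).
samples = [
    ({"heard": "where is the tea garden"}, "where is the tea garden"),
    ({"heard": "what time do you open"}, "what time do you open"),
    ({"heard": "who is the director"}, "who is the director"),
    ({"heard": "where is the laboratory"}, "where is the library"),
]

trials = [run_trial(fake_recognize, fake_respond, a, e) for a, e in samples]
report = summarize(trials)
print(report)
```

Timing the whole round trip, rather than recognition alone, matches the idea of a response time as a visitor would experience it; separate timers around each step would show whether the network-bound recognition or the speech output dominates.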
DOI: https://doi.org/10.26760/elkomika.v12i4.%25p
ISSN (print): 2338-8323 | ISSN (electronic): 2459-9638
Published by:
Teknik Elektro Institut Teknologi Nasional Bandung
Address: Gedung 20 Jl. PHH. Mustofa 23 Bandung 40124
Contact: Tel. 7272215 (ext. 206) Fax. 7202892
Email: jte.itenas@itenas.ac.id
This journal is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.