SOICT 2025: THE 14TH INTERNATIONAL SYMPOSIUM ON INFORMATION AND COMMUNICATION TECHNOLOGY
PROGRAM

Days: Friday, December 12th; Saturday, December 13th

Friday, December 12th

08:50-09:30 Session 1: Keynote I: Vincent Wong (The University of British Columbia, Canada)

Integrated sensing and communication (ISAC) is a key technology for sixth-generation (6G) wireless networks, where the same spectral and hardware resources are used for both communication and environmental sensing. Many optimization problems in ISAC require accurate sensing and communication channel models, which are often difficult to obtain. Machine learning (ML) is a powerful tool for solving ISAC problems by enabling data-driven solutions that can bypass the reliance on explicit models. This talk will explore how ML techniques can improve ISAC performance beyond traditional optimization approaches. Two case studies will be discussed: sensing-assisted predictive beamforming and cooperative sensing through ML. These examples will demonstrate the potential of ML to enable end-to-end signal processing for ISAC in 6G wireless networks.

Location: Ballroom
09:30-10:10 Session 2: Keynote II: John C.S. Lui (The Chinese University of Hong Kong, Hong Kong)

In this talk, I will begin with a brief introduction to quantum computing, highlighting the importance and opportunities for pursuing fundamental research in the quantum Internet. In particular, I will discuss how quantum networks can enable quantum information transmission, parallel processing, and distributed processing. Next, I will introduce online learning theory and explain how it can help us explore compelling challenges in building quantum networks and the quantum Internet. To this end, I will delve into the quantum path selection problem, as well as the quantum border gateway protocol (QBGP) if time allows. Finally, I will outline several exciting open research problems at the intersection of quantum networks and quantum computing.

Location: Ballroom
10:40-12:00 Session 3A: SOICT Technical Session I: Quantum Information
Location: Ballroom A
10:40
Quantum Circuit Resource Assessment for ChaCha20 Stream Cipher (abstract)
11:00
EDM4QS: An Emulator-Driven Model for Quantum Scheduling (abstract)
11:20
Toward Acceleration of Variational Quantum Classifier Simulation on GPUs (abstract)
11:40
Performance Analysis of Quantum Federated Learning with Personalized Layer (abstract)
10:40-12:00 Session 3B: SOICT Technical Session II: AI Applications
Location: Ballroom B
10:40
TriFusion: GNN-Based Multimodal Fusion for 3D Object Detection in Autonomous Driving (abstract)
11:00
A Novel Approach for Sino-Vietnamese Text Transcription by Leveraging a Pre-trained BERT and Self-Attention Mechanism (abstract)
11:20
A Comparison of Machine Learning Methods for Alzheimer's Disease Classification in Vietnamese Patients (abstract)
11:40
CodeLit: A Skill-Based Framework for Automated Assessment of Code Comprehension (abstract)
10:40-12:00 Session 3C: SOICT Technical Session III: Software Engineering
Location: Yersin A
10:40
CandleGen: Generating Synthetic OHLC Data for Different Market Trends using GANs (abstract)
11:00
Graph-based Multi-Agents for Text-to-SQL (abstract)
PRESENTER: Quoc-Hung Pham
11:20
URAG 2.0: An Agentic Dual Retrieval Framework for Enhanced Reasoning in RAG-based QA Systems (abstract)
11:40
Boosting Test Smell Prediction Using Deep Learning (abstract)
10:40-12:00 Session 3D: SOICT Technical Session IV: Networking and Communication Technologies
Location: Yersin B
10:40
Propagated Presence: A Bluetooth Propagation-Based Method for Automated Classroom Attendance on Mobile Devices (abstract)
11:00
An Evaluation on Defragmentation with CDC ROADMs in Elastic Optical Networks (abstract)
11:20
Fusing Gated Spatial-Channel Units and Fractal Cross-Scale Attention for Lightweight Waveform Classification (abstract)
11:40
A mobile-based attendance system using Bluetooth MAC address scanning (abstract)
10:40-12:00 Session 3E: Poster Exhibition
Feature Optimization for Improving Locust Detection (abstract)
HMCT: A Hybrid Multi-Scale CNNs-Transformer encoder for Fault Diagnosis in WSNs (abstract)
SafeGen: Embedding Ethical Safeguards in Text-to-Image Generation (abstract)
Evaluating Syllabus via Sub-Criteria: A Comparative Study of LLM and Experts (abstract)
ViTrustKOL: A Vietnamese Dataset for Consumer Trust Classification toward Key Opinion Leaders (abstract)
Efficient Caching for Conditional Flow Matching in Vietnamese Zero-Shot TTS (abstract)
A Robust Multi-Modal Framework for Explicit Content Detection in Digital Forensics via Adversarial-Resilient Ensemble Learning and Homomorphic Encryption (abstract)
A multimodal framework for Vietnamese Sign Language Recognition (abstract)
Addressing Data Scarcity and Imbalance in Depression Screening with Persona-Driven Synthetic Data (abstract)
Fish-Net: an Effective Model for Underwater Fish Detection (abstract)
VRAE: Vertical Residual Autoencoder for License Plate Denoising and Deblurring (abstract)
A Hybrid Quantum-Classical Machine Learning Framework for Robust Sepsis Detection Utilizing Immune Gene Signatures (abstract)
ViFin-MARS: A Question-Answering System for Financial News Dataset integrating User Intent Identification and Multi-Agent RAG Systems (abstract)
Polynomial-Augmented Instant Neural Graphics Primitives (abstract)
Improve the Effectiveness of Predicting Student Learning Outcomes using a MoE Networks with LSTM Routing (abstract)
Contrastive Preference Optimization for Low-Resource Vietnamese to Khmer Neural Machine Translation (abstract)
13:30-15:30 Session 4A: SOICT Technical Session V: Generative AI
Location: Ballroom A
13:30
SEA-LION: Southeast Asian Languages in One Network
13:50
AD-GENESIS: Anomaly Detection through Gradient-Guided Generative Synthesis (abstract)
14:10
PRADA-QA: Product QA with Multi-Agent Planning and Dynamic Knowledge Retrieval (abstract)
14:30
Enhancing RAFT with Knowledge Graphs for Question Answering on Vietnamese Legal Texts (abstract)
14:50
Segmentation-Free Handwriting Recognition from Historical Handwritten Documents Using Large Vision-Language Models (abstract)
15:10
GenAI-Enabled Backlog Grooming in Agile Software Projects: An Empirical Study (abstract)
13:30-15:30 Session 4B: SOICT Technical Session VI: AI Applications
Location: Ballroom B
13:30
Optimization Approaches for Language Models in the Task of Translating Sino-Vietnamese Texts into Modern Vietnamese (abstract)
13:50
Motion-Gated Adaptive Filtering for Continuous Sign Language Recognition (abstract)
14:10
Fine-Tuning Large Language Models for Automated English Speaking Proficiency Assessment Using Multimodal Linguistic and Prosodic Features (abstract)
14:30
DRONEs: Deep Reinforcement Optimization for Network k-Connectivity Restoration Enhancement in UAVs (abstract)
14:50
XMedCLIP: A Multimodal Deep Neural Network for Bone Pathology Classification from X-ray Image (abstract)
15:10
Automated ESG classification by using Natural Language Processing Techniques from Vietnamese Company Annual Reports (abstract)
13:30-15:30 Session 4C: SOICT Technical Session VII: Applied Operations Research and Optimization
Location: Yersin A
13:30
Exponential Cone Reformulation for Scalable Estimation of Quantal Response and Multinomial Logit Models (abstract)
13:50
Reinforcement Learning-Enhanced GRASP for the Multiple Traveling Repairmen Problem with Workload Balance (abstract)
14:10
The Min-makespan Vehicle Routing Problem with Drones under Multiple Trips and Visits (abstract)
14:30
Grey Wolf Optimization with Entropy Control for Coverage in DSNs (abstract)
14:50
Modeling and Solving the Bin Packing Problem with Relaxed Capacity Constraints: Applications in Agricultural Land Consolidation in Vietnam (abstract)
13:30-15:30 Session 4D: SOICT Technical Session VIII: Multimedia Processing
Location: Yersin B
13:30
DTD-Mamba: Dual Teacher Distillation for Mamba in Head and Neck Abscess Segmentation (abstract)
13:50
VietMed-VQA: A Novel Dataset and Benchmark for Vietnamese Medical Visual Question Answering (abstract)
14:10
MasHeNe: A Benchmark for Head and Neck CT Mass Segmentation using Window-Enhanced Mamba with Frequency-Domain Integration (abstract)
14:30
An Optimization-Driven Fusion Framework of Vision–Language Foundation Models for Large-Scale Video Retrieval (abstract)
14:50
Text-Driven 3D Interior Scene Generation using 3D Gaussian Splatting (abstract)
15:10
When Events Speak: MLLM-Guided Video Retrieval with Temporal Reranking (abstract)
13:30-18:00 Session 4E: Poster Exhibition
VisionCare: Compute-Aware Hybrid CNN–Transformer Heads for Multi-Disease Retinal Diagnosis with Explainable AI (abstract)
An In-Depth Investigation into Vietnamese Lexical Text Normalization on Social Media (abstract)
A Method for Composing Concerns into a Unified Domain Model in Domain-Driven Design (abstract)
MedPRS: Scientific Paper Submission Recommendation System for Medical Research (abstract)
Enhancing YOLOv11n for Reliable Child Detection in Noisy Surveillance Footage (abstract)
Accurate Mixed-Gas Concentration Prediction in Electronic Nose Using Image-Guided Autoencoder–TCN Hybrid Model (abstract)
Merging-based Federated Learning for Lifelong Whole Slide Image Analysis with Vision-Language Models (abstract)
Domain-Incremental Learning for UAV Traffic Video Anomaly Detection (abstract)
A Dual-Path approach for Time Series Anomaly Detection in Building Environmental Sensors (abstract)
FLoRA-KD: Efficient Communication in Federated Learning for Multi-Organ Segmentation through LoRA Knowledge Distillation (abstract)
Auto-Prompting with Retrieval Guidance for Frame Detection in Logistics (abstract)
Factors Influencing the Actual Use of AI-Enabled Chatbots in Digital Wallets for Personal Financial Management Among Vietnamese Online Users (abstract)
Toward Adaptive Web Application Honeypots: Fine-Tuned Large Language Models for Realistic Response Emulation (abstract)
GAFB-MKL: Adaptive Filter Banks via Genetic Algorithm and Sparse Multiple Kernel Learning for EEG-based Motor Imagery Classification (abstract)
Linguistic and Semantic Graph-based Neural Networks for Hate Speech Detection (abstract)
A deep learning model for drug-target interactions prediction in drug discovery (abstract)
Optimization of Resource Allocation Using SLA Violation Penalty and Workload Prediction in Cloud Datacenters (abstract)
Lightweight Multi-Trait IELTS Essay Scoring with Prompt- and Topic-Awareness (abstract)
ViLexCPO: A Multi-Task and Preference-Aligned Framework for Legal Question Answering (abstract)
MSA: Breaking Down MOET Criteria into Sub-Criteria for Education (abstract)
XGPhy: A Machine Learning Framework for Predicting Optimization Difficulty in Maximum Likelihood Phylogenetic Inference (abstract)
Enhancing User-Based Context-Aware Collaborative Filtering Using Energy Distance with Post-Filtering Contextual Features (abstract)
TI-FS: Text and Images Mutual Support for Improving Few-Shot Learning in Cross-Device Image Recapture Detection (abstract)
ViConBERT: Context-Gloss Aligned Vietnamese Word Embedding for Polysemous and Sense-Aware Representations (abstract)
LoDiBi: Automated Course Quality Evaluation Framework with LOQCA, DeepIFSA, and BiLSTM (abstract)
Exploring Consumer Behavior in Clean Food Consumption using Positive–Negative Association Rule Mining: A case study in Vietnam (abstract)
Optimization of Kolmogorov–Arnold Networks for Reinforcement Learning via NeuroEvolution of Augmenting Topologies (abstract)
Towards Regional AQI Mapping in Northern Vietnam: Multi-Source Data Fusion and Ensemble Learning (abstract)
Balanced Multimodal Training through Unified Forward-Backward Modulation Strategy (abstract)
Vehicle routing problems via Quantum Graph Attention Network Deep Reinforcement Learning (abstract)
TaP-GA: A Novel Genetic Algorithm for Target-Prioritized, Orientation-Constrained, and Adaptive Coverage Optimization in Wireless Multimedia Sensor Networks (abstract)
Fast Stochastic Greedy Algorithm for $k$-Submodular Cover Problem (abstract)
A Mathematical Model and Exact Column Generation Approach for RMSA Problem in Elastic Optical Networks (abstract)
An Improved Initialization-based Evolutionary Algorithm for the Top k 2-Clubs Problem (abstract)
Evaluating Phylogenetic and Ancestral Recombination Graph Approaches for Analyzing RNA Virus Recombination: A Case Study of SARS-CoV-2 in Vietnam (abstract)
Comprehensive Assessment of SLM Performance on Vietnamese High School History Tasks (abstract)
An Empirical Study of Multi-Agent RAG for Real-World University Admissions Counseling (abstract)
Synthesizing Cultural Heritage: An End-to-End System for Designing Jewelry with Vietnamese Hue Imperial Motifs (abstract)
Self-training from Self-memory in Data-to-text Generation (abstract)
Vietnamese-guided Post-OCR Processing for Historical Nom Scripts (abstract)
SelfCheckHybrid: A Hybrid Framework for Hallucination Detection in Vietnamese Large Language Models (abstract)
R2E - Requirements-to-Execution System (abstract)
P-PQGC: A Proposed Post Quantization Gain Control for Offline and Streaming Whisper under Different Speaker-to-Microphone Distances (abstract)
A No-Code Solution for Creating AR Indoor Navigation Applications (abstract)
Fast and Lightweight CNN Model for EEG Person Identification on Constrained Hardware (abstract)
SolARG: A Collaborative Tangible Augmented Reality Game for Learning Gravity and Solar System Planets (abstract)
CAMIronment: Supporting Environmental Design Prototyping With Generative AI and Context-Aware Multimodal Interaction (abstract)
Finite-time error control combining neural networks in noisy environments and mobile targets (abstract)
16:00-18:00 Session 5A: SOICT Technical Session IX: AI Applications, AI Foundations and Big Data
Location: Ballroom A
16:00
FedEABoost: A Client Entropy Adaptive Boosting Framework for Federated Learning (abstract)
16:20
Entropy-Based Gradient Weighting and Batch-Size Adaptation for Virtual Data-Parallel Training (abstract)
16:40
AdaFRUGAL: Adaptive Memory-Efficient Training with Dynamic Control (abstract)
17:00
Part-GNN: A partitioning-based graph neural network for efficient memory large scale data classification (abstract)
17:20
Enhancing Survey Efficiency: A Validated Vietnamese Short-Form of the MBTI Developed Through Machine Learning (abstract)
17:40
CITADEL: A Web-Based Faculty Performance Evaluation and Decision-Support System for Higher Education Institutions (abstract)
18:00
AUF iAssist: A Web-Based Helpdesk System for Efficient Support and Concern Resolution (abstract)
16:00-18:00 Session 5B: SOICT Technical Session X: AI Applications
Location: Ballroom B
16:00
GRACE: A Knowledge Graph–Enhanced Conversational Recommendation System via Retrieval-Augmented Generation (abstract)
16:20
Effectiveness of Rolling-Sum Preprocessing in River Mouth Water Depth Prediction Using Machine Learning (abstract)
16:40
Enhance Sequential Recommendation via Linear Recurrent Units (abstract)
17:00
Aspect-Based Sentiment Analysis for Stock Price Movement Prediction (abstract)
17:20
Tokenization in Protein Language Models: Methods, Taxonomy, and Applications (abstract)
16:00-18:00 Session 5C: SOICT Technical Session XI: Applied Operations Research and Optimization
Location: Yersin A
16:00
Non-Parametric Feature Combination For Explainable Credit Scoring (abstract)
16:20
Deterministic one-pass streaming algorithm for non-monotone DR-submodular maximization under a size constraint (abstract)
16:40
DESW: Reducing Concentration in Proof-of-Stake with Dynamic Exponential Stake Weighting (abstract)
17:00
Balancing Efficiency and Fairness in the Integrated Truck–Drone Dispatching Problem with Dynamic Endurance via Pareto Front Grid Guided Multi-objective Optimization (abstract)
17:20
Budgeted Object Detection via Online Submodular Approximation Algorithm (abstract)
16:00-18:00 Session 5D: SOICT Technical Session XII: Networking and Communication Technologies, Software Engineering
Location: Yersin B
16:00
A Lightweight and Robust Framework for Waveform Classification Using Dynamic Warping and State-Space Models (abstract)
16:20
Channel-Aware Power and Rate Control for UOWC with DRL and HARQ Integration (abstract)
16:40
Threshold-based AP Filtering and Distance Measure Analysis for K-means Clustering in WiFi Fingerprinting-based Indoor Localization System (abstract)
17:00
A Bounded Model Checking Approach for Verifying OSEK/VDX Applications (abstract)
17:20
UAV-Based Target Terminal Search System for Emergency Rescue (abstract)
Saturday, December 13th

08:30-09:10 Session 7: Keynote III: Josiah Poon (University of Sydney, Australia)

Like water that adapts to any container, documents need Natural Language Processing (NLP) systems that adapt and move across modalities, pages and scales to find verifiable evidence. In this talk, I will share a practical agenda to build NLP systems that ingest text, images, layout, tables and figures and produce traceable answers. I emphasise three pillars: integration, learning and retrieval. Integration: fuse multimodal features and layout-aware encodings so text and visual content are interpreted together. Learning: train specialist teachers across modalities and distil their feature knowledge into compact deployable students for NLP tasks. Retrieval: adopt a retrieval-first approach, using multipage and multimodal retrieval to find candidate passages, tables and figures, then chain those candidates into a clear evidence trail. I demonstrate how graph-based encodings and multiscale reasoning work together, and how multiteacher distillation compacts expert knowledge into deployable students. Then, with concise multimodal case studies and retrieval-centric metrics, I show measurable gains in evidence grounding, generalisation and operational readiness. I conclude with practical measures to control complexity and annotation cost, and present simple experiments and evaluation criteria for different domains.

Location: Ballroom
09:10-09:50 Session 8: Keynote IV: Tung Kum Hoe Anthony (National University of Singapore, Singapore)

In an era dominated by massive foundation models, smaller players risk being left behind—unable to afford the scale, data, or manpower that large AI systems demand. This talk introduces the concept of Prudent AI—an approach that emphasizes right-sized, lightweight, and explainable intelligence delivered through just-in-time, Plug-and-Play AI Boxes. Focusing on applications like early anomaly detection in multivariate time series, we demonstrate how our AI Boxes use sparse data, minimal compute, and human-guided refinement to detect rare but critical events. The architecture integrates symbolic reasoning, data-driven refinement, and secure edge deployment, showing how being small can actually be a strength in resource-constrained settings. Through this, we reimagine how organizations can adopt AI that is transparent, agile, and sustainable.

Location: Ballroom
10:20-12:00 Session 9A: SOICT Technical Session XIII: Lifelog Event Retrieval
Location: Ballroom A
10:20
Toward Abstraction-Level Event Retrieval in Large Video Collections: Leveraging Human Knowledge and LLM-Based Reasoning in the Ho Chi Minh City AI Challenge 2025 (abstract)
10:40
Real-Time Hybrid Multimodal Retrieval System for AI Challenge HCMC 2025 (abstract)
11:00
Towards Conversational Video Retrieval with an Intelligent Search Agent (abstract)
11:20
Applying Large Language Model (LLM) Agents for Automated Lifelog Retrieval (abstract)
11:40
Leveraging Composed Image Retrieval Principles for Efficient Textual Feedback in Multimodal Retrieval (abstract)
12:00
Estimating size of lesions in Endoscopic Images using depth model-based approaches (abstract)
10:20-12:00 Session 9B: SOICT Technical Session XIV: AI Applications
Location: Ballroom B
10:20
FA-Net: A Dual-Branch Attention Architecture for Extracting Fine-Grained Anatomical Features of Wood (abstract)
10:40
Adaptive Rainfall Forecasting from Multiple Geographical Models Using Matrix Profile and Ensemble Learning (abstract)
11:00
Toward a Culture‑Aware Vietnamese Mental Health Support Chatbot with Large Language Models (abstract)
11:20
MiRAGE: Misconception Detection with Retrieval-Guided Multi-Stage Reasoning and Ensemble Fusion (abstract)
10:20-12:00 Session 9C: SOICT Technical Session XV: Human Computer Interaction and Intelligent Interactive Systems
Location: Yersin A
10:20
A Co-Simulation Approach for UAV-Network-AI Interaction in Digital Twin Visual Context (abstract)
10:40
Fairy360VR: Immersive 360° Storytelling with Large Language Models and Generative Diffusion (abstract)
11:00
Enhancing VR Drink Taste Believability using Olfactory Stimulation (abstract)
11:20
An eye-tracking system for extracting and visualizing visual features of dyscalculia in children (abstract)
11:40
MO-PO RM: A Collaborative Mixed Reality Board Game for Engaging Players and Audience in Learning through Playing (abstract)
10:20-12:00 Session 9D: SOICT Technical Session XVI: Multimedia Processing
Location: Yersin B
10:20
LOGOS: Language-guided Oriented Object Detection in Aerial Scenes (abstract)
10:40
From Text to Thumbnail: A Unified Framework for Automated News Image Generation and Evaluation for Daily Activities (abstract)
11:00
Self-Supervised ViT for Endoscopy: I-JEPA Pretraining with Label-Free Diffusion Assessment (abstract)
11:20
Generalizability Evaluation and Anchor-Guided Approach for Category-Agnostic Pose Estimation (abstract)
11:40
RIOT: Robust Incremental Few-Shot Instance Segmentation via Synthetic Feature Generation with Optimal Transport (abstract)
10:20-12:00 Session 9E: Poster Exhibition
Integrated Semantic and Temporal Alignment for Interactive Video Retrieval (abstract)
HelioSearch: A Multimodal Video Retrieval Framework with LLM-Driven Query Expansion and Hybrid Filtering (abstract)
Enhanced Multimodal Video Retrieval System: Integrating Query Expansion and Cross-modal Temporal Event Retrieval (abstract)
VidAlign: Integrating Multi-Event Alignment and LLM Co-Searching for Video Retrieval (abstract)
PerceptionBrowswer: Enhancing information retrieval system with spatial-temporal knowledge (abstract)
TARS: Temporal Alignment Retrieval System for Efficient Multi-Segment Video Event Retrieval (abstract)
KPT: Enhancing Temporal Event Retrieval in Vietnamese News Videos (abstract)
TEMPO: A Multimodal Video Retrieval System with Sequential Query Support (abstract)
FRED: Unified Multimodal Fusion and Dynamic Temporal Reasoning with Semantic Query Expansion and Exclusionary Search (abstract)
A Video Retrieval System with Advanced Temporal Algorithm and Vision Language Models Integration (abstract)
FrameSeeker: Shot-Level Captioning with Multimodal Hints for Efficient Video Retrieval (abstract)
Adaptive Agent-Guided Dynamic Programming for Temporal Optimization in Multi-Event Video Retrieval (abstract)
GalaxyAssistant: An Intelligent Assistant for Multimedia Event Retrieval (abstract)
Text-Guided Filtering to Enhance Open-Vocabulary Object Detection for Sport Event Retrieval (abstract)
PRESENTER: Duc-Thang Nguyen
Fusurge: An Accelerated Query-Driven System for Multimodal Information Retrieval (abstract)
ATLAS: Adaptive Temporal Low-rank Alignment System for AI Challenge 2025 (abstract)
MERVIN: A Unified Framework for Multimodal Event Retrieval in Vietnamese News Videos (abstract)
Aligning Time and Semantics (ATS): A System for Temporal Retrieval and Alignment of Key Events (abstract)
Unlocking Arbitrary-Length Querying for Video Retrieval via Advanced Vision-Language Models and Hybrid Temporal Search (abstract)
Tournament-Inspired Elimination Reranking for Multi-Modal Video Retrieval (abstract)
Multi-modal and Temporally-aware Video Retrieval (abstract)
Cross Segment Coherence Scorer: A Training Free Temporal Framework for Multimodal Video Retrieval (abstract)
Poly-Temporal Search: Bridging Composed and Temporal Queries for Multimodal Video Retrieval (abstract)
LGCA: Enhancing Semantic Representation via Progressive Expansion (abstract)
Visual Retrieval-Augmented Generation for Silhouette-Guided Animal Art (abstract)
FLUID: Flow-Latent Unified Integration via Token Distillation for Expert Specialization in Multimodal Learning (abstract)
Research Paper Quality Recognition Through Textual Feature Analysis (abstract)
Efficient Probabilistic Cross-Modal Retrieval via Top-k Selection and Fast Embedding Learning (abstract)
Text-Based Person Search in Low-Resource Scenarios (abstract)
GigaCount: Enhancing Crowd Counting by Integrating a Multi-Scale Feature Fusion Model into CLIP-EBC (abstract)
Integrating Motion-based Technique and Deep Learning for Expression Analysis in Vietnamese Traditional Chèo (abstract)
VisionGuard: Synergistic Framework for Helmet Violation Detection (abstract)
Edit3DGS: Unified Framework for Dynamic Head Editing via 2D Instruction-Guided Diffusion and 3D Gaussian Splatting (abstract)
VNProductKIE: A Dataset and Three-Stage Pipeline for Key Product Information Recognition on Vietnamese Packaging Labels (abstract)
MMCS: Multimodal Mamba Channel Switching for Object Detection via RGB-IR Fusion (abstract)
Balancing Quality, Speed, and Compactness of 3D Gaussian Splatting (abstract)
OTGen-FSIS: Optimal Transport–Driven Feature Generation for Few-Shot Instance Segmentation (abstract)
DAKTA: Directional Kolmogorov-Arnold Classifier for Task Arithmetic in Continual Learning (abstract)
CIAN: Multi-Stage Framework for Event-Enriched Image Captioning via Retrieval-Augmented Generation (abstract)
Improving Code-Switching Speech Synthesis via Concatenated Tokenizers (abstract)
Impact of Foggy Weather on Anomaly Detection in Aerial Traffic Surveillance Videos: An In-Depth Analysis (abstract)
Lightweight digital signature algorithms based on linear public-key (abstract)
Exploring Multi-Modal Large Language Models and Two-Stage Fine-Tuning for Fashion Image Retrieval (abstract)
A RGB-D Dataset of Isolated Vietnamese Sign Language (abstract)
13:30-15:30 Session 10A: SOICT Technical Session XVII: Lifelog Event Retrieval
Location: Ballroom A
13:30
U-CESE: Unified Clip-based Event Search Engine for AI Challenge HCMC 2025 (abstract)
13:50
Visionary: Optimized Temporal Video Retrieval via Large Language Model-Enhanced Query Processing (abstract)
14:10
KPTER: K-Pointer for Temporal Event Retrieval (abstract)
14:30
MADTempo: An Interactive System for Multi-Event Temporal Video Retrieval with Query Augmentation (abstract)
14:50
AIthena-Vision: Adaptive Temporal Multimodal Event Retrieval with LLM-generated Multiperspective Fusion (abstract)
15:10
Lucifer-TRACE: Dynamic Programming and LVLM-Aided Verification for Event-Based Video Retrieval (abstract)
13:30-15:30 Session 10B: SOICT Technical Session XVIII: AI Applications
Location: Ballroom B
13:30
The Privacy–Utility Trade-off in Brain MRI Synthesis: A Comparative Framework for Generative Models (abstract)
13:50
Task-Aware Harmonization of Sentinel-2 for Canopy Height Mapping: A Deep Learning Application in the Ngoc Linh Mountains, Vietnam (abstract)
14:10
Adaptive Multi-Level Attention for Effective Cross-Domain Brain Tumor Detection (abstract)
14:30
Critical Success Factors for AI Adoption: A Multivocal Literature Review and a Top Management Perspective (abstract)
14:50
A Computational Framework for the Personalized Remediation of Reading Difficulties Using Dynamic Bayesian Networks (abstract)
15:10
Towards Reliable Oriented Surgical Instrument Detection: Benchmark and Evaluation (abstract)
13:30-15:30 Session 10C: SOICT Technical Session XIX: Recent Advances in Cyber Security
Location: Yersin A
13:30
Robust Intrusion Detection and Classification in EVSE Using Ensemble Methods (abstract)
13:50
FOAMI: Enhancing ICS Threat Detection via Feature Optimization, Realistic Augmentation, and Mutual Inference (abstract)
14:10
A Novel Framework for Android Malware Detection Based on Function Call Graph Pruning and Contrastive Learning (abstract)
14:30
MPPO-GEM: Reinforcement Learning Approach for Generating Evasive Malware against Static and Dynamic Malware Detectors (abstract)
14:50
Pri-WeDec: A Private Deep Learning Approach for Weapon Detection in Digital Forensics (abstract)
15:10
Few-Shot Intrusion Detection using Model-Agnostic Meta-Learning with Deep Neural Networks (abstract)
13:30-15:30 Session 10D: SOICT Technical Session XX: Multimedia Processing
Location: Yersin B
13:30
Scene Graph for Vietnamese Video Understanding: An Agentic Approach with Reasoning (abstract)
13:50
OpenLifelogQA: An Open-Ended Multimodal Lifelog Question-Answering Dataset (abstract)
14:10
EnAug: ENT Endoscopy Images Classification Using Ensemble and Augmentation Methods (abstract)
14:30
EDGER: EDge-Guided with HEatmap Refinement for Generalizable Image Forgery Localization (abstract)
14:50
Hierarchical Multi-Modal Retrieval for Knowledge-Grounded News Image Captioning (abstract)
15:10
From Relative to Absolute: Monocular Depth Estimation in Aerial Imagery (abstract)
13:30-17:20 Session 10E: Poster Exhibition
Anatomy-based Brain Hemorrhage Segmentation and Application in Assessment of Traumatic Brain Injury Severity (abstract)
House Price Prediction via Attribute, Visual, and Economic Features (abstract)
AEye: Avian Monitoring from Streaming Videos (abstract)
Optimizing UAV Swarm Routing with Optical Communication Systems (abstract)
Practical multivariate algebraic signature scheme with one hidden group (abstract)
QuantaMind: A Robust and Efficient Framework for Quantum Machine Learning Applications (abstract)
GENLog: Enhance Generalization to Log-based Anomaly Detection (abstract)
Architecting Trustworthy AI: The Cyber-Resilient AI (CRAI) Framework (abstract)
Adaptive Federated Learning for Software Vulnerability Detection (abstract)
A Method for Building QA Corpora for Low-Resource Languages (abstract)
JuanQueue: A Digital Appointment and Queuing System for a Government Organization (abstract)
Smart Mobility through Hybrid Offline-Online Scheduling for Ridesharing (abstract)
16:00-17:20 Session 11A: SOICT Technical Session XXI: Lifelog Event Retrieval
Location: Ballroom A
16:00
CLIPAR: Multimodal and Temporal-Aware Video Retrieval System (abstract)
16:20
Vortex: A Multi-Modal Fusion System for Intelligent Video Retrieval (abstract)
16:40
Efficient Video Retrieval for Less-Resourced Languages via Multi-Modal Semantic Search (abstract)
16:00-17:20 Session 11B: SOICT Technical Session XXII: AI Applications
Location: Ballroom B
16:00
AuMoM: A Framework for Learning Discriminative Speaker Embeddings using a Mamba-based Mixture of Experts and Contrastive Loss (abstract)
16:20
A Survey on Challenges and Emerging Frontiers of Multi-Agent Systems (abstract)
16:40
Improving Plant Species Distribution Models with Hydrologic and Topographic Features (abstract)
16:00-17:20 Session 11C: SOICT Technical Session XXIII: Recent Advances in Cyber Security
Location: Yersin A
16:00
Password Generation Based on GenAI for Evaluating the Security of Password-Based Control Systems (abstract)
16:20
FusionMalNet: A Hybrid Ensemble Architecture for Windows Malware Detection (abstract)
16:40
PowerGAN: Enhancing PowerShell Attack Detection through GAN-Driven Data Generation (abstract)
16:00-17:20 Session 11D: SOICT Technical Session XXIV: Multimedia Processing
Location: Yersin B
16:00
SimGraph: A Unified Framework for Scene Graph-Based Image Generation and Editing (abstract)
16:20
Forged Calamity: Benchmark for Cross-Domain Synthetic Disaster Detection in the Age of Diffusion (abstract)
16:40
HERF: Hybrid Evidence Retrieval Framework for Entity-Centric Question Answering (abstract)