Abstract: Document Visual Question Answering (DocVQA) offers a promising approach to extracting insights from large document corpora. However, existing benchmarks focus on evaluating multi-modal ...
Abstract: Traditionally, Photovoltaic (PV) power generation forecasting is based on numeric meteorological vectors to capture weather conditions, which generally misses valuable multi-modal information ...