I am actively recruiting PhD students at the Indian Institute of Technology (IIT) Bhubaneswar!
Dr. Devashree Tripathy is an Assistant Professor in Computer Science and Engineering at the Indian Institute of Technology, Bhubaneswar. Her research spans Computer Architecture, Machine Learning, Hardware/Software codesign of AI systems, Energy-Efficient Computing, Sustainable AI, TinyML, and Systems for Healthcare, Drones, Autonomous Cars and Robotics, and Mobile and Edge Computing. Her current research focuses on ML for Systems and Systems for ML. Prior to joining IIT Bhubaneswar, Dr. Tripathy was a Postdoctoral Fellow in Computer Science at Harvard University in the Harvard Architecture, Circuits, and Compilers Group. She is currently an Associate with Harvard University. She graduated from the University of California, Riverside with a PhD in Computer Science.
• Dean’s Distinguished Fellowship
• Award of Excellence
• Quick-Hire Fellowship by Government of India, CSIR.
• Governor’s Gold Medal for being ranked first among all undergraduate students in the university. | University Topper
A2 |
Guac: Energy-Aware and SSA-Based Generation of Coarse-Grained Merged Accelerators from LLVM-IR. Arxiv '24
@article{brumar2024guac,
title={Guac: Energy-Aware and SSA-Based Generation of Coarse-Grained Merged Accelerators from LLVM-IR},
author={Brumar, Iulian and Rocha, Rodrigo and Bernat, Alex and Tripathy, Devashree and Brooks, David and Wei, Gu-Yeon},
journal={arXiv preprint arXiv:2402.13513},
year={2024}
}
|
A1 |
PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices Arxiv '23
@article{chai2023perfsage,
title={PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep Learning Models on Edge Devices},
author={Chai, Yuji and Tripathy, Devashree and Zhou, Chuteng and Gope, Dibakar and Fedorov, Igor and Matas, Ramon and Brooks, David and Wei, Gu-Yeon and Whatmough, Paul},
journal={arXiv preprint arXiv:2301.10999},
year={2023}
}
|
C7 |
ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design. ISCA '23
@inproceedings{krishnan2023archgym,
title={ArchGym: An Open-Source Gymnasium for Machine Learning Assisted Architecture Design},
author={Krishnan, Srivatsan and Yazdanbakhsh, Amir and Prakash, Shvetank and Jabbour, Jason and Uchendu, Ikechukwu and Ghosh, Susobhan and Boroujerdian, Behzad and Richins, Daniel and Tripathy, Devashree and Faust, Aleksandra and others},
booktitle={Proceedings of the 50th Annual International Symposium on Computer Architecture},
pages={1--16},
year={2023}
}
|
C6 |
LocalityGuru: A PTX Analyzer for Extracting Thread Block-level Locality in GPGPUs NAS '21
@INPROCEEDINGS{9605411,
author={Tripathy, Devashree and Abdolrashidi, AmirAli and Fan, Quan and Wong, Daniel and Satpathy, Manoranjan},
booktitle={2021 IEEE International Conference on Networking, Architecture and Storage (NAS)},
title={LocalityGuru: A PTX Analyzer for Extracting Thread Block-level Locality in GPGPUs},
year={2021},
volume={},
number={},
pages={1-8},
doi={10.1109/NAS51552.2021.9605411}}
|
C5 |
ICAP: Designing Inrush Current Aware Power Gating Switch for GPGPU NAS '21
@INPROCEEDINGS{9605434,
author={Zamani, Hadi and Tripathy, Devashree and Jahanshahi, Ali and Wong, Daniel},
booktitle={2021 IEEE International Conference on Networking, Architecture and Storage (NAS)},
title={ICAP: Designing Inrush Current Aware Power Gating Switch for GPGPU},
year={2021},
volume={},
number={},
pages={1-8},
doi={10.1109/NAS51552.2021.9605434}}
|
C4 |
Slumber: Static-Power Management for GPGPU Register Files ISLPED '20
@inproceedings{tripathy2020slumber,
title={Slumber: static-power management for gpgpu register files},
author={Tripathy, Devashree and Zamani, Hadi and Sahoo, Debiprasanna and Bhuyan, Laxmi N and Satpathy, Manoranjan},
booktitle={Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design},
pages={109--114},
year={2020}
}
|
C3 |
SAOU: Safe Adaptive Overclocking and Undervolting for Energy-Efficient GPU Computing ISLPED '20
@inproceedings{zamani2020saou,
title={SAOU: safe adaptive overclocking and undervolting for energy-efficient GPU computing},
author={Zamani, Hadi and Tripathy, Devashree and Bhuyan, Laxmi and Chen, Zizhong},
booktitle={Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design},
pages={205--210},
year={2020}
}
|
C2 |
GreenMM: Energy-Efficient GPU Matrix Multiplication Through Undervolting ICS '19
@inproceedings{DBLP:conf/ics/ZamaniLTBC19,
author = {Hadi Zamani and
Yuanlai Liu and
Devashree Tripathy and
Laxmi N. Bhuyan and
Zizhong Chen},
title = {GreenMM: energy efficient {GPU} matrix
multiplication through undervolting},
booktitle = {Proceedings of the {ACM} International
Conference on Supercomputing,
{ICS} 2019, Phoenix, AZ, USA, June 26-28, 2019},
pages = {308--318},
year = {2019},
crossref = {DBLP:conf/ics/2019},
url = {https://doi.org/10.1145/3330345.3330373},
doi = {10.1145/3330345.3330373},
timestamp = {Wed, 19 Jun 2019 08:40:19 +0200},
biburl = {https://dblp.org/rec/bib/conf/ics/ZamaniLTBC19},
bibsource = {dblp computer science bibliography,
https://dblp.org}
}
|
C1 |
WIREFRAME: Supporting Data-dependent Parallelism through Dependency Graph Execution in GPUs. MICRO '17
@inproceedings{abdolrashidi2017wireframe,
title={Wireframe: supporting data-dependent
parallelism through dependency graph execution
in GPUs},
author={Abdolrashidi, Amir Ali and Tripathy,
Devashree and Belviranli, Mehmet Esat and
Bhuyan, Laxmi Narayan and Wong, Daniel},
booktitle={Proceedings of the 50th Annual
IEEE/ACM International Symposium on
Microarchitecture},
pages={600--611},
year={2017},
organization={ACM}
}
|
J4 |
FARSI: Early-stage Design Space Exploration Framework to Tame the Domain-specific System-on-chip Complexity ACM TRANSACTIONS TECS'22
@article{boroujerdian2023farsi,
title={FARSI: An early-stage design space exploration framework to tame the domain-specific system-on-chip complexity},
author={Boroujerdian, Behzad and Jing, Ying and Tripathy, Devashree and Kumar, Amit and Subramanian, Lavanya and Yen, Luke and Lee, Vincent and Venkatesan, Vivek and Jindal, Amit and Shearer, Robert and others},
journal={ACM Transactions on Embedded Computing Systems},
volume={22},
number={2},
pages={1--35},
year={2023},
publisher={ACM New York, NY}
}
|
J3 |
PAVER: Locality Graph-based Thread Block Scheduling for GPUs ACM TRANSACTIONS TACO'21
@article{10.1145/3451164,
author = {Tripathy, Devashree and Abdolrashidi, Amirali and Bhuyan, Laxmi Narayan and Zhou, Liang and Wong, Daniel},
title = {PAVER: Locality Graph-Based Thread Block Scheduling for GPUs},
year = {2021},
issue_date = {September 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {18},
number = {3},
issn = {1544-3566},
url = {https://doi.org/10.1145/3451164},
doi = {10.1145/3451164},
abstract = {The massive parallelism present in GPUs comes at the cost of reduced L1 and L2 cache sizes per thread, leading to serious cache contention problems such as thrashing. Hence, the data access locality of an application should be considered during thread scheduling to improve execution time and energy consumption. Recent works have tried to use the locality behavior of regular and structured applications in thread scheduling, but the difficult case of irregular and unstructured parallel applications remains to be explored. We present PAVER, a Priority-Aware Vertex schedulER, which takes a graph-theoretic approach toward thread scheduling. We analyze the cache locality behavior among thread blocks (TBs) through a just-in-time compilation, and represent the problem using a graph representing the TBs and the locality among them. This graph is then partitioned to TB groups that display maximum data sharing, which are then assigned to the same streaming multiprocessor by the locality-aware TB scheduler. Through exhaustive simulation in Fermi, Pascal, and Volta architectures using a number of scheduling techniques, we show that PAVER reduces L2 accesses by 43.3%, 48.5%, and 40.21% and increases the average performance benefit by 29%, 49.1%, and 41.2% for the benchmarks with high inter-TB locality.},
journal = {ACM Trans. Archit. Code Optim.},
month = {jun},
articleno = {32},
numpages = {26},
keywords = {thread block, dependency graph, locality, GPGPU}
}
|
J2 |
An improved load‑balancing mechanism based on deadline failure recovery on GridSim EC, Springer'16
@Article{Patel2016,
author="Patel, Deepak Kumar and Tripathy, Devashree and Tripathy, Chitaranjan",
title="An improved load-balancing mechanism based on deadline failure recovery on GridSim",
journal="Engineering with Computers",
year="2016",
month="Apr",
day="01",
volume="32",
number="2",
pages="173--188",
abstract="Grid computing has emerged a new field, distinguished from conventional distributed computing. It focuses on large-scale resource sharing, innovative applications and in some cases, high performance orientation. The Grid serves as a comprehensive and complete system for organizations by which the maximum utilization of resources is achieved. The load balancing is a process which involves the resource management and an effective load distribution among the resources. Therefore, it is considered to be very important in Grid systems. For a Grid, a dynamic, distributed load balancing scheme provides deadline control for tasks. Due to the condition of deadline failure, developing, deploying, and executing long running applications over the grid remains a challenge. So, deadline failure recovery is an essential factor for Grid computing. In this paper, we propose a dynamic distributed load-balancing technique called ``Enhanced GridSim with Load balancing based on Deadline Failure Recovery'' (EGDFR) for computational Grids with heterogeneous resources. The proposed algorithm EGDFR is an improved version of the existing EGDC in which we perform load balancing by providing a scheduling system which includes the mechanism of recovery from deadline failure of the Gridlets. Extensive simulation experiments are conducted to quantify the performance of the proposed load-balancing strategy on the GridSim platform. Experiments have shown that the proposed system can considerably improve Grid performance in terms of total execution time, percentage gain in execution time, average response time, resubmitted time and throughput. The proposed load-balancing technique gives 7 {\%} better performance than EGDC in case of constant number of resources, whereas in case of constant number of Gridlets, it gives 11 {\%} better performance than EGDC.",
issn="1435-5663",
doi="10.1007/s00366-015-0409-y",
url="https://doi.org/10.1007/s00366-015-0409-y"
}
|
J1 |
Survey of load balancing techniques for Grid JNCA, Elsevier'16
@article{PATEL2016103,
title = "Survey of load balancing techniques for Grid",
journal = "Journal of Network and Computer Applications",
volume = "65",
number = "",
pages = "103 - 119",
year = "2016",
note = "",
issn = "1084-8045",
doi = "10.1016/j.jnca.2016.02.012",
url = "http://www.sciencedirect.com/science/article/pii/S1084804516000953",
author = "Deepak Kumar Patel and Devashree Tripathy and C.R. Tripathy",
keywords = "Grid computing, Distributed systems, Load balancing"
}
|
B1 |
Real-Time BCI System Design to Control Arduino Based Speed Controllable Robot Using EEG Springer '18
@book{das2018real,
title={Real-Time BCI System Design to Control Arduino Based Speed Controllable Robot Using EEG},
author={Das, Swagata and Tripathy, Devashree and Raheja, Jagdish Lal},
year={2018},
publisher={Springer}
}
|
BC-1 |
Design and Implementation of Brain Computer Interface Based Robot Motion Control Springer'14 |