Job scheduling strategies for parallel processing : 24th international workshop, JSSPP 2021, virtual event, May 21, 2021 : revised selected papers
Dalibor Klusáček; Walfredo Cirne; Gonzalo P. Rodrigo
Springer International Publishing : Imprint: Springer, Springer Nature, Cham, 2021
English [en] · PDF · 17.7MB · 2021 · 📘 Book (non-fiction) · 🚀/lgli/lgrs/upload
Description
This book constitutes the thoroughly refereed post-conference proceedings of the 24th International Workshop on Job Scheduling Strategies for Parallel Processing, JSSPP 2021, held as a virtual event in May 2021 (due to the Covid-19 pandemic). The 10 revised full papers presented were carefully reviewed and selected from 17 submissions. In addition to this, one keynote paper was included in the workshop. The volume contains two sections: Open Scheduling Problems and Proposals and Technical Papers. The papers cover such topics as parallel computing, distributed systems, workload modeling, performance optimization, and others.
Alternative filename
lgli/6145.pdf
Alternative filename
lgrsnf/6145.pdf
Alternative author
Dalibor Klusáček,Walfredo Cirne,Gonzalo P. Rodrigo,Gerhard Goos,Juris Hartmanis,Elisa Bertino,Wen Gao,Bernhard Steffen,Gerhard Woeginger,Moti Yung
Alternative author
Dalibor Klusáĉek; Walfredo Cirne; Gonzalo P Rodrigo
Alternative author
Alfred Bekker; Carol East; Ann Murdoch
Alternative author
JSSPP (Workshop)
Alternative publisher
Springer Nature Switzerland AG
Alternative edition
Lecture Notes in Computer Science, 1st edition 2021, Cham, 2021
Alternative edition
Lecture notes in computer science, 12985, Cham, 2021
Alternative edition
LNCS sublibrary, 1st ed. 2021, Cham :, 2021
Alternative edition
Switzerland, Switzerland
Alternative edition
1, 20211005
Metadata comments
producers:
Springer-i
Springer-i
Alternative description
Preface
Organization
Contents
Keynote
Resampling with Feedback: A New Paradigm of Using Workload Data for Performance Evaluation
1 Introduction
2 Background
3 Using Workload Logs and Models to Drive Simulations
3.1 Workload Modeling
3.2 Problems with Models
3.3 Using Logs Directly
3.4 Drawbacks of Using Logs
4 Resampling and Feedback
4.1 Before Resampling: Input Shaking and User-Based Modeling
4.2 Resampling from a Log
4.3 Adding Feedback
4.4 Applications and Benefits
5 Conclusions
References
Open Scheduling Problems and Proposals
Collection of Job Scheduling Prediction Methods
1 Introduction
2 Background
3 Predictions
3.1 Job Workloads and Metadata
3.2 Metrics
3.3 Repository
4 Conclusion and Outlook
References
Modular Workload Format: Extending SWF for Modular Systems
1 Introduction and Motivation
2 Standard Workload Format
3 Heterogeneous Systems Requirements
4 Modular Workload Format (MWF) Proposal
4.1 Modular Workload Format Fields
4.2 Headers
5 From SWF to MWF
6 MWF Experiences
7 Conclusions
References
Technical Papers
Measurement and Modeling of Performance of HPC Applications Towards Overcommitting Scheduling Systems
1 Introduction
2 Background
2.1 Evaluation Metrics for Scheduling
2.2 Features of Interactive Jobs
3 Overcommitting on Supercomputer Systems
3.1 Node-Level Overcommitting
3.2 Processor Core-Level Overcommitting
3.3 Overcommitting Scheduling System
4 Methodology for Evaluation Under Overcommitting
4.1 Target Workload
4.2 Experiment Environment for Overcommitting
4.3 Definition of Performance Degradation Ratio
5 Performance Evaluation Under Overcommitting
5.1 Two Batch Jobs
5.2 A Batch Job and an Interactive Job
6 Performance Modeling with Overcommitting
6.1 Overview of Prediction Model
6.2 Phase 1: Dangerous Application Detection Model
6.3 Phase 2: Degradation Prediction Model
6.4 Validation of the Model
7 Related Work
8 Conclusion and Future Work
References
Scheduling Microservice Containers on Large Core Machines Through Placement and Coalescing
1 Introduction
2 Related Work
3 Implementation of TRACPAD
3.1 The Container Resource Utilization Model
3.2 Collection of Container Specific Data
3.3 Monitoring Network Traffic
3.4 Generation of the TRACPAD Partitioning Scheme
3.5 Dynamic Provisioning of Resources
4 Experimental Setup
5 Microservice Benchmarks and Workloads
5.1 The Social Network Application
6 Evaluation of the TRACPAD Scheduler
7 Analysis of TRACPAD
8 Factors Affecting the Scheduling Policy
8.1 Effect of Database Size on Scheduling Policy
8.2 Effect of Request Packet Size on Scheduling Policy
9 Container Coalescing
9.1 Design Considerations
9.2 Methodology
9.3 Experimental Setup
9.4 Results
10 Conclusion and Further Work
References
Learning-Based Approaches to Estimate Job Wait Time in HTC Datacenters
1 Introduction
2 Related Work
3 Wait Time Distribution and Intuitive Causes
3.1 Submission Period
3.2 Number of Pending Jobs in Queue
3.3 Share Consumption
3.4 Quotas on Resource Usage
3.5 Resource Requests
4 Job Features Correlation Analysis
4.1 Spearman's Rank Correlation of Numerical Features
4.2 Regression-Based Correlation for All Features
5 Learning-Based Job Wait Time Estimators
5.1 Objectives and Performance Metrics
5.2 Job Wait Time Estimators
5.3 Experimental Evaluation
6 Applicability to Other Workloads
7 Conclusions and Future Work
References
A HPC Co-scheduler with Reinforcement Learning
1 Introduction
2 Background and Challenges
2.1 Resource Management with Reinforcement Learning
2.2 Challenges
2.3 The Adaptive Scheduling Architecture
3 A Co-scheduler Architecture and Algorithm
3.1 Architecture and Algorithm Overview
3.2 Algorithm Details
4 Evaluation
4.1 Computing System
4.2 Applications
4.3 Metrics
4.4 Workloads
4.5 Results
5 Discussion
6 Related Work
7 Conclusion
A Convergence of ASAX
References
Performance-Cost Optimization of Moldable Scientific Workflows
1 Introduction
2 Proposed Algorithm
2.1 Solution Encoding
2.2 Fitness Function
2.3 Cluster Simulator
3 Experimental Results
3.1 Investigated Moldable Workflows
3.2 Local Task Optimization of the Execution Time
3.3 Global Workflow Optimization of Execution Time and Cost
4 Conclusions
4.1 Future Work
References
Temperature-Aware Energy-Optimal Scheduling of Moldable Streaming Tasks onto 2D-Mesh-Based Many-Core CPUs with DVFS
1 Introduction
2 Related Work
3 Architecture and Application Model
3.1 Generic Multi-/Many-core Architecture with DVFS
3.2 Multi-variant Moldable Streaming Tasks
3.3 Scheduling for the Steady State
4 Temperature-Aware Crown Scheduling with Buddying
5 ILP Model with Fixed Buddy Cores
5.1 Time and Energy
5.2 ILP Solution Variables
5.3 Constraints
5.4 Objective Function
5.5 Temperature-Dependent Power Modeling
6 ILP Model with Arbitrary Buddies
7 Evaluation
8 Conclusion and Future Work
References
Scheduling Challenges for Variable Capacity Resources
1 Introduction
2 Scheduling Problem with Resource Capacity Variations
2.1 Scheduling Problem Definition
2.2 Challenges of Job Scheduling
2.3 Approach
3 Resource Capacity Variations from Empirical Traces
3.1 Variation from Price
3.2 Variation from Carbon Emissions
3.3 Variation from Stranded Power
4 Metrics
4.1 Capacity Variation
4.2 Performance
5 Example Studies of Variable Resource Capacity Data Centers
5.1 Experiment Methodology
5.2 Impact of Capacity Variation Dimensions
5.3 Scheduling Potential for Improvement
6 Further Directions and Opportunities
7 Related Work
8 Summary
References
GLUME: A Strategy for Reducing Workflow Execution Times on Batch-Scheduled Platforms
1 Introduction
2 Related Work
3 Problem Statement
4 Experimental Methodology
4.1 Workflow Configurations
4.2 Batch Scheduling and Workloads
5 The Algorithm by Zhang et al.
5.1 Overview
5.2 Evaluation Results
5.3 Discussion
6 Proposed Algorithm
6.1 Intuition and Overview
6.2 Detailed Description
7 Results
7.1 Results for –medium Workflows
7.2 Overall Results
8 Conclusion
References
Author Index
Date open sourced
2024-03-30