Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation
Tags: Paper and LLMs
- Pricing Type: Free
GitHub Link
The GitHub link is https://github.com/michel-liu/grouppose-paddle
Overview
This repository contains the official PaddlePaddle implementation for the ICCV 2023 paper "Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation." The paper introduces Group Pose, a straightforward transformer-based approach to multi-person pose estimation that treats keypoint prediction as a set of queries. The method simplifies decoder self-attention by replacing interactions between different query types with specific group self-attentions. Experimental results on the MS COCO and CrowdPose datasets demonstrate that Group Pose outperforms previous methods without human box supervision, even slightly surpassing ED-Pose, which uses such supervision. The repository provides code, pretrained models, and detailed results for evaluation. The work is released under the Apache 2.0 license.
State-of-the-art solutions adopt the DETR-like framework and mainly develop complex decoders, e.g., regarding pose estimation as keypoint box detection combined with human detection in ED-Pose, or hierarchically predicting with a pose decoder and a joint (keypoint) decoder in PETR.
Content
Introduction
In this paper, we study end-to-end multi-person pose estimation and present a simple yet effective transformer approach, named Group Pose. We simply regard K-keypoint pose estimation as predicting a set of N×K keypoint positions, each from a keypoint query, as well as representing each pose with an instance query for scoring the N pose predictions.
Motivated by the intuition that the interaction among across-instance queries of different types is not directly helpful, we make a simple modification to decoder self-attention. We replace the single self-attention over all N×(K+1) queries with two subsequent group self-attentions: (i) N within-instance self-attentions, each over the K keypoint queries and one instance query of a single pose, and (ii) (K+1) same-type across-instance self-attentions, each over N queries of the same type. The resulting decoder removes the interaction among across-instance, type-different queries, easing optimization and thus improving performance. Experimental results on MS COCO and CrowdPose show that our approach, without human box supervision, is superior to previous methods with complex decoders, and is even slightly better than ED-Pose, which uses human box supervision.
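The grouping scheme above can be sketched in a few lines. The following is a minimal NumPy illustration, not the repository's actual PaddlePaddle code: projections, multi-head splitting, and masking are omitted, and `self_attention` is a toy stand-in for a real attention layer. It shows how N×(K+1) queries, arranged as (N, K+1, dim), pass through the two group self-attentions.

```python
import numpy as np

def self_attention(x):
    """Toy scaled dot-product self-attention over the rows of x: (n, dim).

    Learned Q/K/V projections are omitted for brevity.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x

def group_self_attention(queries):
    """Two subsequent group self-attentions over N*(K+1) queries.

    queries: (N, K+1, dim) -- for each of N instances, K keypoint queries
    followed by one instance query.
    """
    N, K_plus_1, _ = queries.shape
    # (i) N within-instance self-attentions: each over the K keypoint
    # queries plus the one instance query of a single pose.
    out = np.stack([self_attention(queries[i]) for i in range(N)])
    # (ii) (K+1) same-type across-instance self-attentions: each over the
    # N queries occupying the same slot (i.e., of the same type).
    out = np.stack([self_attention(out[:, t]) for t in range(K_plus_1)], axis=1)
    return out

N, K, dim = 4, 17, 32  # e.g., K = 17 keypoints for MS COCO
queries = np.random.default_rng(0).normal(size=(N, K + 1, dim))
out = group_self_attention(queries)
print(out.shape)  # (4, 18, 32)
```

Note that neither group attends across instances and query types at the same time, which is exactly the interaction the paper removes.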