People Counting in Crowded Environment and Re-identification

Frontoni E.; Paolanti M.
2019-01-01

Abstract

Automatically detecting people and understanding their behaviour is a key capability of modern intelligent video systems. The interest arises from societal needs: security and video analytics, intelligent retail environments and activities of daily living are just a few of the possible applications. The problem remains largely open because of several serious challenges, such as occlusion, changes of appearance, and complex, dynamic backgrounds. Moreover, in recent years growing privacy concerns have made the design of these systems more challenging, as they must also cope with regulations that differ from country to country. RGB-D cameras are popular sensors for this task because of their availability, reliability and affordability. Studies have demonstrated the great value, in both accuracy and efficiency, of depth cameras in coping with severe occlusions among humans and with complex backgrounds. In particular, RGB-D cameras show great potential when used in a top-view configuration, which achieves high performance even in crowded environments (at least 3 people per square metre in the area covered by the camera), minimizes occlusions and is also the most privacy-compliant approach. The first step in people detection and tracking is segmentation, which retrieves people's silhouettes. For this reason, this chapter covers different methods, ranging from classical approaches based on handcrafted features to deep learning techniques. These techniques also solve the nontrivial problem of blob collision, which occurs when two or more people are close enough to form a single blob from the camera's point of view. Multilevel segmentation and water filling algorithms are presented as handcrafted-feature-based methods, and a deep learning approach from the literature is also introduced. In the methods presented in this chapter, processing is performed live (no images are recorded) and on the edge, following an IoT paradigm. Live analysis also strengthens the aforementioned privacy compliance.
The last part of this chapter is dedicated to person re-identification (re-id), the process of determining whether different instances or images, recorded at different moments, belong to the same person. Person re-id has many important applications in video surveillance because it saves the human effort of exhaustively searching for a person in large amounts of video. Identification cameras are widely employed in public places such as malls, office buildings, airports, stations and museums. These camera networks generally cover large geospatial areas with non-overlapping fields of view. They provide huge amounts of video data, which are monitored in real time by law enforcement officers and used after the event for forensic purposes. Automated analysis of these data significantly improves the quality of monitoring and speeds up processing. This chapter first exploits handcrafted anthropomorphic features coupled with a machine learning approach and then presents a deep learning approach for comparison. Different metrics are then adopted to evaluate and compare these algorithms.
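The blob collision problem mentioned in the abstract can be illustrated with a small sketch. It is not the chapter's algorithm: it only assumes that a top-view depth frame has already been converted to a height map (centimetres above the floor) and that thresholding at several height levels is a reasonable stand-in for multilevel segmentation. The names `label_components`, `count_heads`, `height_map` and `levels` are hypothetical.

```python
from collections import deque


def label_components(mask):
    """Return connected components of a boolean grid as sets of (row, col),
    using 4-connectivity."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = set()
            while queue:
                y, x = queue.popleft()
                pixels.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def count_heads(height_map, levels):
    """Count blobs that contain no blob at the next higher threshold.

    Assumption for this sketch: each such 'topmost' blob is one head.
    A real system would also filter blobs by area and shape.
    """
    levels = sorted(levels)
    per_level = [
        label_components([[h >= t for h in row] for row in height_map])
        for t in levels
    ]
    heads = 0
    for i, blobs in enumerate(per_level):
        higher = per_level[i + 1] if i + 1 < len(per_level) else []
        for blob in blobs:
            if all(blob.isdisjoint(h) for h in higher):
                heads += 1
    return heads


# Two people standing shoulder to shoulder: heads at 170 cm and 165 cm,
# shoulders at 140 cm touching each other.
height_map = [
    [0,   0,   0,   0,   0,   0, 0],
    [0, 140, 170, 140, 165, 140, 0],
    [0, 140, 140, 140, 140, 140, 0],
    [0,   0,   0,   0,   0,   0, 0],
]

single_level = label_components([[h >= 100 for h in row] for row in height_map])
print(len(single_level))                     # blobs at one threshold
print(count_heads(height_map, [100, 160]))   # heads across two thresholds
```

With a single threshold the two people merge into one blob, while the second, higher threshold separates their heads. The choice of levels (100 cm and 160 cm here) is arbitrary and would need tuning to the camera height and the expected population.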
2019
978-3-030-28602-6
978-3-030-28603-3
Use this identifier to cite or link to this document: https://hdl.handle.net/11393/286707