Hello, I'm Ron Ferens

I am an algorithm engineer specializing in machine-learning and deep-learning applications for computer vision and image processing. My work spans 2D and 3D domains, with applications such as hand and body tracking, expression recognition, large-scale visual localization, and eye and gaze tracking. My academic research focuses on absolute camera pose estimation, where I explore end-to-end learning methods. I am passionate about applying algorithms to real-world problems and contributing to the evolving landscape of computer vision.


Publications

HyperPose: Hypernetwork-Infused Camera Pose Localization and an Extended Cambridge Landmarks Dataset

Ron Ferens, Yosi Keller
CVPR, 2025. Accepted to the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2025.

We advocate for incorporating hypernetworks into single-scene and multi-scene camera pose regression models.

Coarse-to-Fine Multi-Scene Pose Regression with Transformers

Yoli Shavit, Ron Ferens, Yosi Keller
TPAMI, 2023. Published in IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI).

We extend our previous MSTransformer approach by introducing a mixed classification-regression architecture that improves localization accuracy.

Learning Multi-Scene Absolute Pose Regression with Transformers

Yoli Shavit, Ron Ferens, Yosi Keller
ICCV, 2021. Oral presentation at ICCV 2021.

We propose to learn multi-scene absolute camera pose regression with Transformers, where encoders aggregate activation maps with self-attention and decoders transform latent features and scene encodings into candidate pose predictions.

Learning single and multi-scene camera pose regression with transformer encoders

Yoli Shavit, Ron Ferens, Yosi Keller
CVIU, 2024. Published in Computer Vision and Image Understanding (CVIU).

We propose an attention-based approach for pose regression, where the convolutional activation maps are used as sequential inputs.

Do We Really Need Scene-specific Pose Encoders?

Yoli Shavit, Ron Ferens
ICPR, 2020. Oral presentation at the International Conference on Pattern Recognition (ICPR) 2020.

We propose that scene-specific pose encoders are not required for pose regression and that encodings trained for visual similarity can be used instead.

Introduction to Camera Pose Estimation with Deep Learning

Yoli Shavit, Ron Ferens
arXiv, 2019

We review deep learning approaches for camera pose estimation. We describe key methods in the field and identify trends aimed at improving the original deep pose regression solution. We also provide an extensive cross-comparison of existing learning-based pose estimators, together with practical notes on running them, to support reproducibility. Finally, we discuss emerging solutions and potential future research directions.