Andrik Puentes

Coding | github.com/andrikp-upenn

Autonomous Quadrotor: SE(3) Control, Trajectory Optimization & Path Planning

University of Pennsylvania — MEAM 6200

For this project, I developed and deployed a fully autonomous quadrotor navigation system on a Crazyflie platform, implementing the complete stack from trajectory planning through real-time flight control. The goal was to enable autonomous maze navigation using only onboard sensing and VICON motion capture for state estimation, with no human intervention during flight.

Key Features:

1. SE(3) Geometric Flight Controller

  • Implemented a hierarchical position/attitude controller using SE(3) geometric control: an outer-loop PD controller computes desired accelerations from position/velocity errors, which are converted to a desired thrust and attitude via a geometric formulation (a simplified sketch of this structure follows the list below)

  • Inner-loop attitude controller tracks desired orientation and outputs thrust and torque commands using attitude error e_R and angular velocity error.

  • Tuned control gains Kp, Kd through Ziegler-Nichols-style initialization followed by iterative hardware tuning
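
The sketch below shows the shape of that outer/inner loop in Python. It is a minimal illustration only: the gain matrices, the mass value, and the vee() helper are placeholder assumptions rather than the actual flight code, and the inertia/Coriolis terms of the full formulation are omitted for brevity.

import numpy as np

def vee(S):
    # Inverse of the hat map: recover the vector from a skew-symmetric matrix
    return np.array([S[2, 1], S[0, 2], S[1, 0]])

def se3_control(x, v, R, omega, x_des, v_des, a_des, m=0.03, g=9.81):
    # Outer loop: PD on position/velocity error -> commanded acceleration -> desired force
    Kp, Kd = np.diag([6.0, 6.0, 8.0]), np.diag([4.0, 4.0, 5.0])        # placeholder gains
    a_cmd = a_des - Kd @ (v - v_des) - Kp @ (x - x_des)
    F_des = m * (a_cmd + np.array([0.0, 0.0, g]))                      # desired force, world frame

    # Desired attitude: align the body z-axis with the desired force, hold yaw at 0
    b3 = F_des / np.linalg.norm(F_des)
    b1_ref = np.array([1.0, 0.0, 0.0])
    b2 = np.cross(b3, b1_ref)
    b2 = b2 / np.linalg.norm(b2)
    R_des = np.column_stack((np.cross(b2, b3), b2, b3))

    # Inner loop: thrust = projection of F_des onto the current body z-axis;
    # PD torque on the attitude error e_R and angular velocity error
    u1 = F_des @ R[:, 2]
    e_R = 0.5 * vee(R_des.T @ R - R.T @ R_des)
    e_w = omega                                                         # desired angular rate assumed ~0
    KR, Kw = np.diag([250.0, 250.0, 30.0]), np.diag([20.0, 20.0, 8.0])  # placeholder gains
    u2 = -KR @ e_R - Kw @ e_w
    return u1, u2, R_des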

2. Minimum-Snap Trajectory Generator

  • Generated smooth, dynamically feasible trajectories from piecewise quintic polynomial segments, with position, velocity, and acceleration continuity enforced at the waypoints (see the sketch after this list)

  • Allocated segment times proportionally to inter-waypoint distance with a minimum time constraint to prevent aggressive short-segment motion

  • Applied Ramer-Douglas-Peucker (RDP) simplification to reduce dense A* waypoint sequences while preserving path shape and collision safety
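
A rough sketch of two of those pieces, the distance-proportional time allocation and the solve for one quintic segment from its boundary conditions; the function names and the way boundary derivatives are supplied are illustrative assumptions, not the actual implementation.

import numpy as np

def allocate_times(waypoints, v_nom=0.75, t_min=1.0):
    # Segment durations proportional to inter-waypoint distance, floored at t_min
    dists = np.linalg.norm(np.diff(waypoints, axis=0), axis=1)
    return np.maximum(dists / v_nom, t_min)

def quintic_coeffs(p0, v0, a0, p1, v1, a1, T):
    # Solve the six boundary conditions of one quintic segment:
    # position, velocity, and acceleration at t = 0 and t = T
    A = np.array([
        [1, 0, 0,    0,      0,       0],
        [0, 1, 0,    0,      0,       0],
        [0, 0, 2,    0,      0,       0],
        [1, T, T**2, T**3,   T**4,    T**5],
        [0, 1, 2*T,  3*T**2, 4*T**3,  5*T**4],
        [0, 0, 2,    6*T,    12*T**2, 20*T**3]])
    b = np.vstack([p0, v0, a0, p1, v1, a1])   # each row is an x/y/z vector
    return np.linalg.solve(A, b)              # 6x3 matrix of coefficients c0..c5

Continuity at the interior waypoints then comes from giving adjacent segments the same boundary velocity and acceleration values.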

3. A* Graph-Search Path Planner

  • Formulated path planning as a shortest-path problem over a collision-free voxel graph with obstacle inflation margin m = 0.20m

  • Implemented A* search to find minimum-cost paths from start to goal, followed by RDP waypoint simplification with tolerance ε = 0.10m and collision re-verification on the simplified segments (a compact A* sketch follows this list)

  • Tuned planner parameters: map resolution [0.1, 0.1, 0.1]m, nominal speed v = 0.75m/s, minimum segment time Tmin = 1.0s
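
The planner itself is standard A* over the voxel grid. The compact sketch below (26-connected neighbors, Euclidean edge costs, grid already inflated by the obstacle margin) shows the idea; the occupancy-grid format and helper names are assumptions, not the course framework's API.

import heapq, itertools
import numpy as np

def astar(occ, start, goal, resolution=0.1):
    # occ: 3D boolean occupancy grid (True = obstacle, already inflated)
    # start, goal: integer voxel index tuples; returns a voxel path or None
    nbrs = [(dx, dy, dz) for dx in (-1, 0, 1) for dy in (-1, 0, 1) for dz in (-1, 0, 1)
            if (dx, dy, dz) != (0, 0, 0)]
    h = lambda a: resolution * float(np.linalg.norm(np.subtract(a, goal)))
    tie = itertools.count()                      # tiebreaker so the heap never compares node tuples
    open_set = [(h(start), next(tie), 0.0, start)]
    came_from, g_cost = {start: None}, {start: 0.0}
    while open_set:
        _, _, g, node = heapq.heappop(open_set)
        if node == goal:                         # reconstruct the start -> goal path
            path = [node]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1]
        if g > g_cost[node]:
            continue                             # stale heap entry
        for d in nbrs:
            nxt = tuple(int(i) for i in np.add(node, d))
            if any(i < 0 or i >= s for i, s in zip(nxt, occ.shape)) or occ[nxt]:
                continue
            ng = g + resolution * float(np.linalg.norm(d))
            if ng < g_cost.get(nxt, float("inf")):
                g_cost[nxt], came_from[nxt] = ng, node
                heapq.heappush(open_set, (ng + h(nxt), next(tie), ng, nxt))
    return None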

Results

  • Successfully navigated all three maze environments autonomously in hardware deployment using VICON motion capture for state estimation

  • Achieved position RMSE < 0.21 m across all runs; peak z-axis tracking error reduced from 0.35 m to 0.17 m (roughly a 50% reduction) through targeted Kd gain tuning

  • Managed sim-to-real transfer by reducing all gains to approximately 30% of simulation values to account for sensor noise, communication latency, and actuator limits

Artificial Intelligence & Reinforcement Learning - Pacman

Pacman Q-Learning Agent - University of Pennsylvania

For this project, I developed and trained a Reinforcement Learning (RL) agent to play Pacman using Q-learning and Approximate Q-learning. The goal was to enable Pacman to learn an optimal policy through trial-and-error interactions with the game environment, maximizing rewards while avoiding ghosts.

Key Features:

1. Q-Learning Agent Implementation

  • Designed a Q-learning agent that updates a Q-table based on state-action pairs, learning from rewards to optimize future decisions.

  • Implemented ε-greedy action selection, balancing exploration and exploitation for effective learning.

  • Optimized Q-value updates using the Bellman equation (a minimal tabular sketch follows below):

Q(s,a) ← (1 − α)·Q(s,a) + α·(R + γ·V(s′))
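
In outline, the tabular agent looks like the sketch below. It is a generic illustration, not the actual course framework API: the state/action types, the legal-action bookkeeping, and the α, γ, ε values are placeholders.

import random
from collections import defaultdict

class QLearningAgent:
    def __init__(self, alpha=0.2, gamma=0.8, epsilon=0.05):
        self.q = defaultdict(float)          # Q-table keyed by (state, action)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def q_value(self, state, action):
        return self.q[(state, action)]

    def value(self, state, legal_actions):
        # V(s) = max_a Q(s, a); 0.0 for terminal states with no legal actions
        return max((self.q_value(state, a) for a in legal_actions), default=0.0)

    def choose_action(self, state, legal_actions):
        # epsilon-greedy: explore with probability epsilon, otherwise act greedily
        if random.random() < self.epsilon:
            return random.choice(legal_actions)
        return max(legal_actions, key=lambda a: self.q_value(state, a))

    def update(self, state, action, reward, next_state, next_legal):
        # Q(s,a) <- (1 - alpha) Q(s,a) + alpha (R + gamma V(s'))
        target = reward + self.gamma * self.value(next_state, next_legal)
        self.q[(state, action)] = (1 - self.alpha) * self.q_value(state, action) + self.alpha * target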

2. Approximate Q-Learning for Generalization

  • Extended the Q-learning agent to an Approximate Q-learning agent using feature-based function approximation instead of a discrete Q-table.

  • Implemented feature extraction to represent the game state, improving learning efficiency for larger environments.

  • Used a weighted-sum-of-features approach to compute Q-values: Q(s,a) = ∑_i w_i·f_i(s,a)

    where each weight w_i is updated using: w_i ← w_i + α·(R + γ·V(s′) − Q(s,a))·f_i(s,a) (see the sketch below, which extends the tabular agent above)
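
Continuing the tabular sketch above, the approximate agent swaps the Q-table for a weight vector over features; the feature-extractor interface here is an assumed illustration, not the actual one used.

from collections import defaultdict

class ApproximateQAgent(QLearningAgent):
    def __init__(self, feature_fn, **kwargs):
        super().__init__(**kwargs)
        self.features = feature_fn           # maps (state, action) -> {feature_name: value}
        self.w = defaultdict(float)          # one weight per feature instead of a Q-table

    def q_value(self, state, action):
        # Q(s,a) = sum_i w_i * f_i(s,a)
        return sum(self.w[f] * v for f, v in self.features(state, action).items())

    def update(self, state, action, reward, next_state, next_legal):
        # w_i <- w_i + alpha * (R + gamma V(s') - Q(s,a)) * f_i(s,a)
        diff = reward + self.gamma * self.value(next_state, next_legal) - self.q_value(state, action)
        for f, v in self.features(state, action).items():
            self.w[f] += self.alpha * diff * v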

3. Training and Performance Optimization

  • Trained Pacman for 2000 episodes in a noiseless environment to refine its policy.

  • Implemented a state-value function to prioritize high-reward paths and avoid negative states (ghosts).

  • Improved policy stability by introducing a learning-rate decay mechanism (one possible schedule is sketched below).
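
One possible form of that schedule is sketched below; the linear-annealing shape and all constants are illustrative placeholders, not the values actually used.

def decayed_rates(episode, n_train=2000, alpha0=0.5, alpha_min=0.05, eps0=0.3, eps_min=0.0):
    # Linearly anneal the learning rate alpha and exploration rate epsilon over training,
    # then hold them at their floors (epsilon -> 0 after training: pure exploitation)
    frac = min(episode / n_train, 1.0)
    alpha = alpha0 + frac * (alpha_min - alpha0)
    epsilon = eps0 + frac * (eps_min - eps0)
    return alpha, epsilon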

Results

  • The Q-learning agent initially struggled with random exploration but gradually converged to an optimal policy after hundreds of episodes.

  • The Approximate Q-learning agent, leveraging feature extraction, generalized better and was able to win consistently with fewer training episodes.

  • Final performance: Pacman achieved an 80-90% win rate on small grids and successfully adapted to larger environments using feature-based learning.


Machine Perception

Machine Learning

Get in touch at [andrikp@seas.upenn.edu]
