
OpenGL Learning Notes (5) ---- coordinate system, camera


  • introduction
  • coordinate system
    • theory
      • local space
      • world space
      • view space
      • clip space
        • projection matrix
        • projection types
        • perspective division
      • screen space
    • code implementation
      • model matrix
      • view matrix
      • projection matrix
      • Vertex Shader Code
  • camera
    • theory
      • LookAt matrix
    • code implementation
      • mouse input
      • keyboard input
      • Camera class

introduction

In the last note I learned the basics of OpenGL shaders, the use of textures, and transformation matrices, which was enough to draw a textured, moving 2D triangle. In this note, using the coordinate system pipeline, I draw a 3D cube from its own 3D space all the way to screen space, and also implement a custom camera class that can move freely through the 3D world.

coordinate system

Let's start with some theory.

theory

The process of transforming coordinates into normalized device coordinates and then into screen coordinates is usually done step by step, like an assembly line. Along this pipeline, an object's vertices are transformed through several coordinate systems before finally being converted to screen coordinates. The most important coordinate systems are:

  1. Local Space, or Object Space.
  2. World Space
  3. View Space, or Eye Space.
  4. Clip Space
  5. Screen Space

There are three important matrices in the process of transforming between these spaces: the model, the view, and the projection matrices.
The coordinates throughout this series of spaces start out as vec4, i.e., three position components plus one homogeneous coordinate (the w component).

local space

The coordinates in local space are called local coordinates. The position coordinates are those of the model at import time, and the homogeneous (w) coordinate is usually set to 1.0. The origin of local space is chosen by the modeler when modeling and is usually at the center of the object.

world space

World space is the space of the entire 3D world, and the coordinates in this space are called world coordinates.
Local coordinates are transformed (scaled, rotated, translated) into world coordinates; this transformation is performed by the model matrix. The model matrix is model-specific.

view space

View space is also called camera space or eye space, and the coordinates at this stage are called view coordinates.
Going from world space to view space means transforming coordinates that are relative to the world origin into coordinates relative to the camera. This is done by inverting the camera's translation and rotation relative to the world origin and applying that inverse to the world coordinates of all objects; the operation is performed by the view matrix. The view matrix is camera-specific.

clip space

At the end of each vertex shader run, OpenGL expects all coordinates to fall within a certain range; any point outside this range is clipped. Clipped coordinates are discarded, and the remaining coordinates end up as the fragments visible on screen. This is where the name clip space (Clip Space) comes from.
Going from view space to clip space means mapping the camera's viewing volume onto the normalized device coordinate range (in the case of perspective projection).
The mapping takes two steps: first the view coordinates are converted into clip coordinates by the projection matrix, and then perspective division turns the clip coordinates into normalized device coordinates (NDC).
Clip coordinates are an intermediate form defined to make the calculation convenient.

projection matrix

Going from view coordinates to clip coordinates requires the projection matrix. The projection matrix is determined by two things: the projection type and the camera's viewing box (a frustum). The projection matrix is camera-specific.

projection types

There are two types of projection:

  1. Orthographic projection
    The orthographic frustum maps all coordinates inside it directly to normalized device coordinates, without changing the homogeneous (w) coordinate.
  2. Perspective projection
    Perspective projection maps the given frustum into clip space and additionally modifies the w value (the homogeneous coordinate) of each vertex so that the further a vertex is from the viewer, the larger its w component becomes.
perspective division

With perspective projection, the coordinates produced by the matrix multiplication are clip coordinates, and OpenGL then automatically performs an operation called perspective division to turn the clip coordinates into normalized device coordinates:
$$out = \begin{pmatrix} x/w \\ y/w \\ z/w \end{pmatrix}$$
This operation makes coordinates that are further away end up with smaller absolute values, which is what creates the perspective effect.
With orthographic projection, the coordinates produced by the matrix multiplication are directly the normalized device coordinates, with no perspective division needed.
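As a concrete illustration, here is a minimal sketch (assuming GLM; perspectiveDivide is just an illustrative helper, not an OpenGL function) of the division OpenGL performs for us:

#include <glm/glm.hpp>

// Turn clip coordinates into normalized device coordinates by dividing x, y, z by w.
// OpenGL performs this step automatically after the vertex shader stage.
glm::vec3 perspectiveDivide(const glm::vec4& clip)
{
    // A larger w (a vertex further from the viewer) shrinks the resulting NDC values
    return glm::vec3(clip) / clip.w;
}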

screen space

To go from normalized device coordinates to screen coordinates, OpenGL uses a process called the viewport transform (Viewport Transform). The viewport transform maps coordinates in the range -1.0 to 1.0 onto the coordinate range defined by the glViewport function. The resulting screen coordinates are then passed to the rasterizer, which turns them into fragments.
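For reference, the target range of the viewport transform is the one set with glViewport; a typical call (the window size values are illustrative) looks like this:

// Map NDC x and y from [-1, 1] onto a viewport anchored at the window's lower-left corner,
// 800 pixels wide and 600 pixels high
glViewport(0, 0, 800, 600);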

code implementation

For the spatial transformations above, we need to provide the vertex shader with three matrices: model, view, and projection. The model matrix is model-dependent (each model has its own model matrix); the view and projection matrices are camera-dependent.
Formulas for coordinate conversions:
$$V_{clip} = M_{projection} \cdot M_{view} \cdot M_{model} \cdot V_{local}$$
After $V_{clip}$ is obtained, OpenGL automatically performs perspective division and the viewport transform on the vertex coordinates.

model matrix

Using the transformations from the previous note, this matrix defines the model's own position, size, and rotation.

glm::mat4 model = glm::mat4(1.0f);
// Rotate the model -55 degrees around the x axis (tilting it backwards)
model = glm::rotate(model, glm::radians(-55.0f), glm::vec3(1.0f, 0.0f, 0.0f));
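The rotation above can be combined with translation and scaling into a fuller model matrix; a sketch with illustrative values:

glm::mat4 model = glm::mat4(1.0f);
model = glm::translate(model, glm::vec3(0.5f, 0.0f, 0.0f));                      // position in the world
model = glm::rotate(model, glm::radians(-55.0f), glm::vec3(1.0f, 0.0f, 0.0f));   // orientation
model = glm::scale(model, glm::vec3(0.5f));                                      // size
// The combined matrix is T * R * S: vertices are scaled first, then rotated, then translated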

view matrix

The view matrix is the inverse of the camera's own transformation: moving the camera is equivalent to moving the whole scene in the opposite direction.

glm::mat4 view = glm::mat4(1.0f);
// Note that we translate the scene in the direction opposite to where we want the camera to move.
view = glm::translate(view, glm::vec3(0.0f, 0.0f, -3.0f));

projection matrix

orthographic projection

glm::ortho(0.0f, 800.0f, 0.0f, 600.0f, 0.1f, 100.0f); // left, right, bottom, top, near plane, far plane

perspective projection

glm::perspective(glm::radians(45.0f), (float)window_width/(float)window_height, 0.1f, 100.0f); // fov (in radians), aspect ratio, near plane, far plane

The first parameter is the field of view value, fov (Field of View).

Vertex Shader Code

#version 330 core
layout (location = 0) in vec3 aPos;
...
uniform mat4 model;
uniform mat4 view;
uniform mat4 projection;

void main()
{
    // Note that multiplication is read from right to left
    gl_Position = projection * view * model * vec4(aPos, 1.0);
    ...
}
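To connect the GLM matrices from the previous subsections to these uniforms, they have to be uploaded to the shader program each frame. A sketch, assuming a linked shader program object named shaderProgram and the header <glm/gtc/type_ptr.hpp> for glm::value_ptr:

glUseProgram(shaderProgram);
// Look up each uniform's location and upload the corresponding matrix (no transpose needed,
// since GLM stores matrices in the column-major layout OpenGL expects)
glUniformMatrix4fv(glGetUniformLocation(shaderProgram, "model"),      1, GL_FALSE, glm::value_ptr(model));
glUniformMatrix4fv(glGetUniformLocation(shaderProgram, "view"),       1, GL_FALSE, glm::value_ptr(view));
glUniformMatrix4fv(glGetUniformLocation(shaderProgram, "projection"), 1, GL_FALSE, glm::value_ptr(projection));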

camera

theory

The idea behind implementing a camera in the 3D world is to capture the user's keyboard and mouse input and use it to change the camera's viewing position and viewing direction.
As mentioned above, I need to provide three matrices (model, view, projection) to draw the 3D world. The view and projection matrices are the ones related to the camera: view depends on the camera's pose (position and rotation), and projection depends on the camera's "lens" (the field of view).
So all we have to do is map the user input to camera changes:
Keyboard WASD changes the camera's position
Mouse movement changes the camera's orientation
Mouse wheel changes the camera's field of view (zooming in and out)

LookAt matrix

Given a right-handed coordinate system defined by three mutually perpendicular vectors R (the right vector), U (the up vector), and D (the direction vector), together with a position vector P, we can build a LookAt matrix; multiplying any vector by it transforms that vector into this coordinate space:
$$LookAt = \begin{bmatrix} R_x & R_y & R_z & 0 \\ U_x & U_y & U_z & 0 \\ D_x & D_y & D_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix} 1 & 0 & 0 & -P_x \\ 0 & 1 & 0 & -P_y \\ 0 & 0 & 1 & -P_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
What LookAt accomplishes is exactly what the view matrix is supposed to do, so we only need to keep track of the camera's position vector P and its three basis vectors R, U, and D in real time, generate the LookAt matrix of the camera's coordinate system, and pass it to the vertex shader as the view matrix.
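To make the matrix above concrete, here is a sketch (assuming GLM) of building a LookAt matrix by hand; glm::lookAt, used later in the Camera class, produces the same result:

#include <glm/glm.hpp>

glm::mat4 myLookAt(glm::vec3 position, glm::vec3 target, glm::vec3 worldUp)
{
    // The three basis vectors of the camera's coordinate system
    glm::vec3 D = glm::normalize(position - target);       // direction vector (points backwards)
    glm::vec3 R = glm::normalize(glm::cross(worldUp, D));  // right vector
    glm::vec3 U = glm::cross(D, R);                        // up vector

    // GLM is column-major: mat[column][row]
    glm::mat4 rotation = glm::mat4(1.0f);
    rotation[0][0] = R.x; rotation[1][0] = R.y; rotation[2][0] = R.z;   // first row holds R
    rotation[0][1] = U.x; rotation[1][1] = U.y; rotation[2][1] = U.z;   // second row holds U
    rotation[0][2] = D.x; rotation[1][2] = D.y; rotation[2][2] = D.z;   // third row holds D

    glm::mat4 translation = glm::mat4(1.0f);
    translation[3][0] = -position.x;   // fourth column holds -P
    translation[3][1] = -position.y;
    translation[3][2] = -position.z;

    return rotation * translation;
}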

code implementation

To summarize what the camera class does: it reads the user's input (keyboard WASD, mouse movement, and mouse wheel) and outputs the LookAt matrix and the fov value.

mouse input

How to send mouse actions to the camera class.
The first step is to hide the cursor and capture the cursor. Capturing the cursor means that if the focus is on your program, the cursor should stay in the window (unless the program loses focus or exits).

glfwSetInputMode(window, GLFW_CURSOR, GLFW_CURSOR_DISABLED);

Then listen for mouse movement events

void mouse_callback(GLFWwindow* window, double xpos, double ypos);
glfwSetCursorPosCallback(window, mouse_callback);

Handling of movement events

// Assumes globals defined elsewhere: bool firstMouse = true; float lastX, lastY; and the Camera object camera
void mouse_callback(GLFWwindow* window, double xpos, double ypos){
    if(firstMouse){
        // Avoid a large jump on the first mouse event
        lastX = xpos;
        lastY = ypos;
        firstMouse = false;
    }

    float xoffset = xpos - lastX;
    float yoffset = lastY - ypos; // reversed, since window y coordinates go from top to bottom
    lastX = xpos;
    lastY = ypos;

    camera.ProcessMouseMovement(xoffset, yoffset);
}

The scroll wheel input is similar

......
void scroll_callback(GLFWwindow* window, double xoffset, double yoffset);
......
int main(){
	......
	glfwSetScrollCallback(window, scroll_callback);
	......
}
......
void scroll_callback(GLFWwindow* window, double xoffset, double yoffset){
	// yoffset represents the value of the scroll wheel scrolling
    camera.ProcessMouseScroll(yoffset);
}

keyboard input

Event capture during each frame of rendering:

// Assumes globals defined elsewhere: float deltaTime = 0.0f, lastFrame = 0.0f;
void processInput(GLFWwindow *window){
	.......
    // Frame timing: deltaTime keeps the movement speed independent of the frame rate
    float currentFrame = glfwGetTime();
    deltaTime = currentFrame - lastFrame;
    lastFrame = currentFrame;
    // Keyboard movement
    if(glfwGetKey(window, GLFW_KEY_W) == GLFW_PRESS)
        camera.ProcessKeyboard(FORWARD, deltaTime);
    if(glfwGetKey(window, GLFW_KEY_S) == GLFW_PRESS)
        camera.ProcessKeyboard(BACKWARD, deltaTime);
    if(glfwGetKey(window, GLFW_KEY_A) == GLFW_PRESS)
        camera.ProcessKeyboard(LEFT, deltaTime);
    if(glfwGetKey(window, GLFW_KEY_D) == GLFW_PRESS)
        camera.ProcessKeyboard(RIGHT, deltaTime);
}

Camera class

Here are the properties and methods of the Camera class

#ifndef CAMERA_H
#define CAMERA_H

#include <glad/glad.h>
#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

#include <vector>

// Defines several possible options for camera movement. Used as abstraction to stay away from window-system specific input methods
enum Camera_Movement {
    FORWARD,
    BACKWARD,
    LEFT,
    RIGHT
};

// Default camera values
const float YAW         = -90.0f;
const float PITCH       =  0.0f;
const float SPEED       =  2.5f;
const float SENSITIVITY =  0.1f;
const float ZOOM        =  45.0f;


// An abstract camera class that processes input and calculates the corresponding Euler Angles, Vectors and Matrices for use in OpenGL
class Camera{
public:
    // Camera Attributes
    glm::vec3 Position;
    glm::vec3 Front;
    glm::vec3 Up;
    glm::vec3 Right;
    glm::vec3 WorldUp;
    // Euler Angles
    float Yaw;
    float Pitch;
    // Camera options
    float MovementSpeed;
    float MouseSensitivity;
    float Zoom;

    // Constructor with vectors
    Camera(glm::vec3 position = glm::vec3(0.0f, 0.0f, 0.0f), glm::vec3 up = glm::vec3(0.0f, 1.0f, 0.0f), float yaw = YAW, float pitch = PITCH) : Front(glm::vec3(0.0f, 0.0f, -1.0f)), MovementSpeed(SPEED), MouseSensitivity(SENSITIVITY), Zoom(ZOOM){
        Position = position;
        WorldUp = up;
        Yaw = yaw;
        Pitch = pitch;
        updateCameraVectors();
    }
    // Constructor with scalar values
    Camera(float posX, float posY, float posZ, float upX, float upY, float upZ, float yaw, float pitch) : Front(glm::vec3(0.0f, 0.0f, -1.0f)), MovementSpeed(SPEED), MouseSensitivity(SENSITIVITY), Zoom(ZOOM){
        Position = glm::vec3(posX, posY, posZ);
        WorldUp = glm::vec3(upX, upY, upZ);
        Yaw = yaw;
        Pitch = pitch;
        updateCameraVectors();
    }

    // Returns the view matrix calculated using Euler Angles and the LookAt Matrix
    glm::mat4 GetViewMatrix();

    // Processes input received from any keyboard-like input system. Accepts input parameter in the form of camera defined ENUM (to abstract it from windowing systems)
    void ProcessKeyboard(Camera_Movement direction, float deltaTime);

    // Processes input received from a mouse input system. Expects the offset value in both the x and y direction.
    void ProcessMouseMovement(float xoffset, float yoffset, GLboolean constrainPitch = true);

    // Processes input received from a mouse scroll-wheel event. Only requires input on the vertical wheel-axis
    void ProcessMouseScroll(float yoffset);

private:
    // Calculates the front vector from the Camera's (updated) Euler Angles
    void updateCameraVectors();
};
#endif

The first thing the camera does is accept input and maintain its pose state:

    void ProcessKeyboard(Camera_Movement direction, float deltaTime){
        float velocity = MovementSpeed * deltaTime;
        if (direction == FORWARD)
            Position += Front * velocity;
        if (direction == BACKWARD)
            Position -= Front * velocity;
        if (direction == LEFT)
            Position -= Right * velocity;
        if (direction == RIGHT)
            Position += Right * velocity;
    }

    void ProcessMouseMovement(float xoffset, float yoffset, GLboolean constrainPitch = true){
        xoffset *= MouseSensitivity;
        yoffset *= MouseSensitivity;

        Yaw   += xoffset;
        Pitch += yoffset;

        // Make sure that when pitch is out of bounds, screen doesn't get flipped
        if (constrainPitch){
            if (Pitch > 89.0f)
                Pitch = 89.0f;
            if (Pitch < -89.0f)
                Pitch = -89.0f;
        }

        // Update Front, Right and Up Vectors using the updated Euler angles
        updateCameraVectors();
    }

    void ProcessMouseScroll(float yoffset){
        if (Zoom >= 1.0f && Zoom <= 45.0f)
            Zoom -= yoffset;
        if (Zoom <= 1.0f)
            Zoom = 1.0f;
        if (Zoom >= 45.0f)
            Zoom = 45.0f;
    }

The camera's rotational pose is maintained mainly through two angles, pitch and yaw, with the direction vectors updated in sync, as the formula and code below show.
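The conversion from the two angles to a front direction vector is the standard spherical-to-Cartesian formula (with the world up axis along y), which is exactly what updateCameraVectors computes:

$$\vec{front} = \begin{pmatrix} \cos(yaw) \cdot \cos(pitch) \\ \sin(pitch) \\ \sin(yaw) \cdot \cos(pitch) \end{pmatrix}$$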

    void updateCameraVectors(){
        // Calculate the new Front vector
        glm::vec3 front;
        front.x = cos(glm::radians(Yaw)) * cos(glm::radians(Pitch));
        front.y = sin(glm::radians(Pitch));
        front.z = sin(glm::radians(Yaw)) * cos(glm::radians(Pitch));
        Front = glm::normalize(front);
        // Also re-calculate the Right and Up vector
        Right = glm::normalize(glm::cross(Front, WorldUp));  // Normalize the vectors, because their length gets closer to 0 the more you look up or down which results in slower movement.
        Up    = glm::normalize(glm::cross(Right, Front));
    }

The LookAt matrix can be generated directly with the glm::lookAt function provided by GLM:

    glm::mat4 GetViewMatrix(){
        return glm::lookAt(Position, Position + Front, Up);
    }

The glm::lookAt function takes a position, a target, and an up vector.
So, taking the Camera class as a whole: in the main function, on every frame we fetch the camera's current LookAt matrix and zoom value and generate the corresponding view and projection matrices; this is enough to realize the camera's movement, as sketched below.
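A sketch of how this looks in the render loop (the starting position, window size, and clip planes are illustrative):

Camera camera(glm::vec3(0.0f, 0.0f, 3.0f));
......
while (!glfwWindowShouldClose(window))
{
    processInput(window);

    // Ask the camera for the view matrix and use its zoom value as the fov
    glm::mat4 view       = camera.GetViewMatrix();
    glm::mat4 projection = glm::perspective(glm::radians(camera.Zoom),
                                            (float)window_width / (float)window_height,
                                            0.1f, 100.0f);
    // upload view and projection to the shader as shown earlier, then draw
    ......
    glfwSwapBuffers(window);
    glfwPollEvents();
}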
You can refer to my code.

The ideas in this article and the diagrams that appear are from