## CSE 565 spring 2021 03 15 Camera geometry, intrinsic vs extrinsic camera parameters

Mar 31, 2022

Hello everybody. In today's lecture we will continue our camera geometry discussion and finish it up, and then we will start talking about processing the images that we have produced with this camera.

Last time we talked about the homogeneous coordinate system and its usage. We said that with the help of homogeneous coordinates we can represent points at infinity, and that we can treat points at infinity like regular points: in our equations, such as the homography equations, we can use them with no problem.

This was one of the examples of how we use the homography. I think last time I told you that four points are enough to find the homography between these two images. The points that we chose were regular points, but we could have done it differently.

Let's say I have two parallel lines; let me use the green color. This is one line and this is another line, and I know that if I continue them in the image they will intersect. In reality they should not intersect, because they are parallel in the real world.

But in this image, if I continue them (let me exaggerate a little bit), they are going to intersect here. Now I do just the opposite in the other image: the point corresponding to this one is going to be the intersection of these two lines; let me use the blue color for them. Well, these two blue lines will not intersect, because they are parallel. But using the homogeneous coordinate system I can find their intersection point at infinity, and I can say that the intersection point of the two blue lines and the intersection point of the two green lines correspond to each other. So I can use that pair as one of my correspondences.

It is going to be something, something, zero: (a, b, 0) is the intersection point of the blue lines, and I can use it in my equations here. So points at infinity can be used for establishing a homography, and your homework actually includes that. I haven't assigned your homework yet; I have written it, and I will post it after the class. Your homework is about further examples of homography like this one.
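To make the point-at-infinity idea concrete, here is a minimal sketch in Python with NumPy. It relies on the standard facts that the homogeneous line through two points is their cross product and that the intersection of two lines is also a cross product. The specific line endpoints and the helper names `line_through` and `intersect` are illustrative assumptions, not values from the lecture.

```python
import numpy as np

def line_through(p, q):
    """Homogeneous line through two 2D points given as (x, y)."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])

def intersect(l1, l2):
    """Homogeneous intersection point of two homogeneous lines."""
    return np.cross(l1, l2)

# Two parallel "blue" lines, both with direction (2, 1) (made-up numbers).
blue_1 = line_through((0.0, 0.0), (2.0, 1.0))
blue_2 = line_through((0.0, 3.0), (2.0, 4.0))

# Two converging "green" lines (made-up numbers).
green_1 = line_through((0.0, 0.0), (4.0, 1.0))
green_2 = line_through((0.0, 3.0), (4.0, 2.0))

p_inf = intersect(blue_1, blue_2)    # last component is 0: a point at infinity
p_fin = intersect(green_1, green_2)  # last component is nonzero: a regular point

print("blue intersection (homogeneous):", p_inf)
print("green intersection (x, y):", p_fin[:2] / p_fin[2])
```

The blue intersection has the form (a, b, 0), where (a, b) is proportional to the common direction of the two lines, so it can be paired with the finite green intersection as one homography correspondence.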

If you have two images of the same planar surface, there is a homography between this image plane and the planar surface, and there is also a homography between that image plane and the planar surface. I have two image planes because I have taken two pictures, and there is one planar surface in the real world. So one homography, H1, transforms from image one to the planar surface, and H2 transforms from the planar surface to image two.

To get the homography between the two images, what do I do? On image one I have a point, say this point x. To transform it from image one to the planar surface coordinates, I multiply it by H1: H1 times x gives me capital X. Then H2 times capital X gives me x prime, which is this point on image two. But if I would like to find a homography directly from image one to image two, what would I do?

Is anybody talking? Are my speakers off? We have to find the homography between image one and image two; that is what I am asking. How do you find it? If you just run OpenCV's findHomography, that's fine, but there is a much better way of finding it. I would like to find H3 between these two images. How do I find it?

Student: "We can just multiply H1 and H2."

Exactly. What is capital X? It is H1 times x. So what is x prime then? H2 times H1 times x. What is the size of H1 as a matrix? 3 by 3, and H2 is 3 by 3 too. If you multiply two 3 by 3 matrices you get another 3 by 3 matrix. So if I say H3 is H2 times H1, then x prime is H3 times x. That's the beauty of using homogeneous coordinates: it makes this kind of thing a lot easier.
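The composition argument can be checked numerically. The sketch below uses two arbitrary invertible 3 by 3 matrices as stand-ins for H1 (image one to plane) and H2 (plane to image two); the numbers are made up for illustration, and `dehomogenize` is a hypothetical helper name.

```python
import numpy as np

# Made-up homographies: H1 maps image one to the plane, H2 maps the plane to image two.
H1 = np.array([[1.0, 0.2, 5.0],
               [0.0, 1.1, -3.0],
               [0.001, 0.0, 1.0]])
H2 = np.array([[0.9, -0.1, 2.0],
               [0.1, 1.0, 4.0],
               [0.0, 0.002, 1.0]])

# Direct homography from image one to image two.
H3 = H2 @ H1

def dehomogenize(v):
    """Convert a homogeneous 2D point (3-vector) to (x, y)."""
    return v[:2] / v[2]

x = np.array([10.0, 20.0, 1.0])   # a point on image one, homogeneous
X = H1 @ x                        # the same point on the planar surface
x_prime_two_steps = H2 @ X        # image two, via the plane
x_prime_one_step = H3 @ x         # image two, directly

print(dehomogenize(x_prime_two_steps), dehomogenize(x_prime_one_step))
```

Both routes give the same image-two point, which is why a single 3 by 3 matrix is enough to go from one picture of the plane to the other.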

Remember, this is a planar surface: all the points are on the same plane. They are not freely moving; if they move, they move on the same surface. This kind of homography is not valid for arbitrary points, like a point here, a point there, and another point there; I cannot find a homography between the two images using that kind of points. The homography is only valid if there is a planar surface like that.

For example, there is a homography between this image and that image, and that's fine, because these are two planes. If I know the correspondences between the points of these images, I can find it with no problem. The same thing holds here: I know which points correspond to each other. There is also a homography between this shadow, because it lies on a plane, and this side of the building.

But for the homography to exist you need a plane. Without the plane there is no homography, and we are going to see this again in our stereo chapter.

Last time we looked at this. We said that using homogeneous coordinates we can handle the perspective projection. This is a real-world three-dimensional point, but it is expressed with four components because it is in homogeneous coordinates. If I multiply it with this very simple matrix, I get a three-component vector, which is a point of the two-dimensional world in homogeneous coordinates. So this one is in the three-dimensional world and this one is in the two-dimensional world, both in homogeneous coordinates.

To convert homogeneous coordinates to non-homogeneous coordinates, what do I do? I divide x and y by the last component, minus z over d, and this is what I get. If you remember, this is what our derivations produced before, using the similarity of triangles.

There is no single way of achieving this. In fact, I can use a four by four matrix instead. What I input is a three-dimensional point in homogeneous coordinates, and the output is also a three-dimensional point in homogeneous coordinates. But if I convert it to non-homogeneous coordinates, what do I get? I get x and y; z is ignored, because z is lost with the projection transformation. Even if I divide z by the last component, all I get is a constant that depends only on d. Whatever you do, it is independent of z.

So homogeneous coordinates are very useful with projection matrices. Perspective projection is handled this way; don't forget this one, because we will use different versions of it all over.
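As a rough illustration of the two matrix forms described here, the sketch below builds a 3 by 4 and a 4 by 4 perspective matrix. The slide is not reproduced in the transcript, so the exact entries are an assumption: the last row is taken to be (0, 0, -1/d, 0), which matches the spoken "divide by minus z over d". The value of `d` and the test point are made up.

```python
import numpy as np

d = 2.0  # assumed distance between center of projection and image plane

# 3x4 form: output is a 2D point in homogeneous coordinates (x, y, -z/d).
P34 = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, -1.0 / d, 0.0]])

# 4x4 form: output is a 3D point in homogeneous coordinates (x, y, z, -z/d).
P44 = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, -1.0 / d, 0.0]])

X = np.array([1.0, 2.0, -4.0, 1.0])  # real-world point (x, y, z, 1)

h2 = P34 @ X
xy = h2[:2] / h2[2]      # (-d*x/z, -d*y/z)

h3 = P44 @ X
xyz = h3[:3] / h3[3]     # third component is the constant -d, whatever z is

print(xy, xyz)
```

With the 4 by 4 form the third output coordinate is the same constant for every input point, which is the sense in which the depth information is lost.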

So that was perspective projection. There is another projection type, called orthographic projection or parallel projection. It just loses one of the components: it drops z. This is what we do: the matrix has rows 1 0 0 0, then 0 1 0 0, then 0 0 0 1. I multiply the three-dimensional point in homogeneous coordinates by it and I get (x, y, 1), and if I divide x and y by 1 I get (x, y). So z is lost.

This is called parallel projection. What is the difference between perspective projection and parallel projection? Where is my perspective projection? Okay, this is perspective projection, and this is another example of perspective projection.

So what is the difference in effect between orthographic projection and perspective projection? If I show you two images, each has to come from some kind of projection. How would you know whether it is an orthographic projection or a perspective projection?

Student: "The parallel lines stay the same."

With orthographic projection parallel lines stay parallel, right. Why? Because in orthographic projection z is ignored; there is no z component in the resulting system. With perspective projection you divide things by z, so if z is very large your x and y become very small. With orthographic projection we don't have that. So one way to tell is to check whether the parallel lines are still parallel or not.
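The difference between the two projections can be seen by projecting the same unit-height segment at two depths. This is a minimal sketch: the orthographic matrix is the one read out in the lecture, while the perspective matrix with last row (0, 0, -1/d, 0) and the numbers used are assumptions made for illustration.

```python
import numpy as np

D = 1.0  # assumed image plane distance for the perspective case

PERSPECTIVE = np.array([[1.0, 0.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0, 0.0],
                        [0.0, 0.0, -1.0 / D, 0.0]])

ORTHOGRAPHIC = np.array([[1.0, 0.0, 0.0, 0.0],
                         [0.0, 1.0, 0.0, 0.0],
                         [0.0, 0.0, 0.0, 1.0]])

def project(matrix, point_xyz):
    """Project a 3D point and convert the result to non-homogeneous (x, y)."""
    h = matrix @ np.append(point_xyz, 1.0)
    return h[:2] / h[2]

def projected_height(matrix, z):
    """Image height of a vertical segment of real height 1 placed at depth z."""
    bottom = project(matrix, np.array([0.0, 0.0, z]))
    top = project(matrix, np.array([0.0, 1.0, z]))
    return abs(top[1] - bottom[1])

for name, m in (("perspective", PERSPECTIVE), ("orthographic", ORTHOGRAPHIC)):
    print(name, projected_height(m, -2.0), projected_height(m, -10.0))
```

Under the perspective matrix the far segment shrinks, while under the orthographic matrix both segments keep the same image height, which is the behavior the lecture contrasts.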

There is a special kind of lens that produces orthographic-like images; we call them telecentric lenses. Look at this scene. If I take the picture using a conventional perspective projection lens, this is the image that I see. So what do you see? How do you know that this is a perspective projection?

Your parallel-lines test doesn't work very well here, right? All the parallels are parallel again: this red edge is parallel to that one, the green ones are the same, and for the blue one the parallels are parallel again.

Student: "But their extensions will eventually meet, right?"

No, this one is perfectly parallel; it is not going to intersect. The green, red and blue ones won't. How about the black ones? Even though I know they are parallel in the real world, the black ones are going to intersect. Parallel lines are intersecting, so this is perspective.

But here I don't care about the parallel lines. All I care about is this: I know that in the real world the red box and the blue box are the same size, but on my image they are different. That's not good if I am doing an inspection. If I am measuring their sizes and I would like to check that each of them is five millimeters high and five millimeters wide, then this image is not useful, because my measurements depend on how far away each object is. But if I use a telecentric lens, this is what I am going to get: it doesn't matter how far away the blue box is, it will be the same size as the red one.

This is called a telecentric lens. Let me click on this one; I would like to show you a good example of telecentric lens images, pin inspection. They are usually used for inspection. Yes, this one is good.

Let's say I have a VLSI chip like this one, and I would like to find cases like this: see, this pin is bent; it is not at a right angle to the board. If I take a picture of this chip using a perspective camera, this is what I am going to get: the distances between the tops of the pins are larger and the distances between the bottoms are smaller, so I cannot do much inspection. But if I use an orthographic projection lens, that is, a telecentric lens, this is what I am going to see, and I can definitely find the bent pin.

So for this kind of case orthographic projection is important, and I can handle it because I know how to do orthographic projection. But usually this is our projection matrix: given the three-dimensional position of a point, this will be the corresponding point on my image plane. Any questions so far? If there are no questions I will continue. Let me skip this one.

Okay, let's go back to our previous discussion, back to the formulas. It doesn't matter whether I am using d, or whether my center of projection is in front of the image plane or not: any point (x, y, z) in the real world will be projected to (x prime, y prime, z prime) on my image plane, pi prime. This is my image plane, pi prime.

In this case I actually don't want to talk about z prime, because z prime is constant. What is its value? Nobody knows the answer? No, don't say anything; I will pick somebody. What is the value of z prime?

Student: "One."

No, it is not. Why did you say one?

Student: "Because it's a normalized distance."

No, it is not one. You are close, but it is not one. Look at the picture and you can tell me. Pi prime is this plane; the x and y values change on it, but z is constant. What is z? I am not saying you should know; I am just asking you to look at the picture. It is right there in the picture.

Student: "Minus z?"

What did I say? This z prime is constant. If you say minus z, then it wouldn't be constant, right? x prime and y prime are not constant; that's why I have formulas for them. But for z prime I don't have a formula, because it is constant. What is the value of z prime? Can it be zero? Well, if it were zero then the plane would be here, right?

This point is (0, 0, 0), so no, it is not zero. Can you tell me the position of C prime then? What is the position of C prime here?

Student: "f prime."

Well, it has three coordinates, right?

Student: "Zero, zero and f prime."

Exactly: (0, 0, f prime). So if C prime is (0, 0, f prime), what is the z prime of any point on that plane? f prime, exactly. So z prime is f prime, the focal length.

Since I am projecting these three-dimensional points onto this image plane, I am going to lose one of the dimensions. z is going to be lost; x and y are going to change, and these are the formulas; and z prime is going to be a constant. That's why we call it projection: we lose one of the components. So we are losing a lot of data when we take pictures of real-world objects.

Orthographic projection says that it doesn't matter what your z is: x prime and y prime are going to stay the same. Perspective projection says no, the position of a point on your image plane depends on its z. Although I am going to lose that z information, the x prime and y prime positions will depend on z. So this f prime is very important for us.
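Written out, the projection formulas referred to in this part of the lecture take the following form. The slide itself is not in the transcript, so the notation is reconstructed from the spoken description ("x prime is x over z times f prime", and z prime equal to f prime); the sign of f prime depends on which side of the center of projection the image plane is drawn.

$$
x' = f'\,\frac{x}{z}, \qquad y' = f'\,\frac{y}{z}, \qquad z' = f'
$$

For comparison, the orthographic projection described earlier keeps $x' = x$ and $y' = y$ regardless of $z$.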

It will be a very critical parameter of our camera in determining these x prime and y prime numbers.

Another thing here is this. I like these two formulas for x prime and y prime, but they are in millimeters; they are in real-world coordinates. On an image I would like to talk about columns and rows. If this is my image, this is column number one, column number two, three, four, and this is row number one, row number two, three, four, five. If I want to talk about a point, I don't want to talk about 5.2 millimeters or something like that. In the real world this point is at, say, 1.7 millimeters and 2.5 millimeters, and let's say f is 100 millimeters, so the third coordinate is minus 100. I don't care where this point lives in the real world. All I would like to know is which column and which row this point is in on my image: one, two, three, four, and one, two, three, four, five; sorry, column four and row five. On my image I care about my columns and my rows.

This coordinate in millimeters is produced by these two formulas, x prime and y prime. How am I going to go from these numbers to this column and row? What do I need?

I know that this point is (0, 0, 0), and this point is (0, 0, -100); I know that one too. So how would I go from this position to this one? Does anybody have any ideas? It is not difficult.

Student: "We need to know something here about P, C prime and O."

It is just a right angle. There you go: the angle between the optical axis and pi prime is 90 degrees. I know that.

What else do I need to know? How many rows and how many columns are there in this image? How would I know that?

Student: "We need to know the range."

What do you mean by the range? What is the range?

Just explain that.

Student: "I mean the minimum and maximum values of x and y."

So what determines that? It is our sensor size, right? The sensor size is important; that's one thing. The other thing is the size of each pixel: what is the height of each row, this height h, and what is this width w? Each pixel has a height and a width, so I need those, even though they are going to be very small. If I know them, then by dividing the x value by this width and the y value by this height I can get the column and the row of the pixel that this value belongs to.

So these are very important parameters. One is the focal length (odak uzaklığı in Turkish); the other two are the width and the height of the pixels on my real CMOS or CCD sensor. If I put them in my equations it will be clear.

So there is a real-world point, and this point is projected through my pinhole; that is my assumption, even though I am using lenses. The ray will hit my physical retina, the sensor array, which has many rows and many columns. And I know my focal length: my focal length is this much, f.
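The "divide by the pixel width and height" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the millimeter coordinates are measured from the image corner (the lecture later adds an image-center offset for the general case), indices are zero-based, and the pixel size of 0.4 mm by 0.5 mm is a made-up value chosen so the arithmetic is easy to follow. The function name `mm_to_pixel` is hypothetical.

```python
import math

def mm_to_pixel(x_mm, y_mm, pixel_w_mm, pixel_h_mm):
    """Return the zero-based (column, row) of the pixel containing (x_mm, y_mm).

    Assumes the millimeter coordinates are measured from the image corner.
    """
    col = math.floor(x_mm / pixel_w_mm)
    row = math.floor(y_mm / pixel_h_mm)
    return col, row

# The lecture's example point at (1.7 mm, 2.5 mm), with made-up pixel dimensions.
col, row = mm_to_pixel(1.7, 2.5, pixel_w_mm=0.4, pixel_h_mm=0.5)
print(col, row)
```

Real sensors have pixels on the order of micrometers, so the same division simply produces much larger column and row numbers.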

So C0 is f away from O. To make things a bit more understandable, I will assume that there is a normalized image plane, although I don't have such a physical plane at all. It is just 1.0 away from the origin, and everything is perfect on it. Since this plane is 1.0 away from the origin, I can assume that f is 1.0 for it.

In this case, what is p hat going to be? Let's say the point is (x, y, z). Remember the formulas: x prime is x over z times f prime. Because f is 1.0 here, p hat is just (x over z, y over z). That's the coordinate.

But such a plane does not exist. I like this plane because I can assume that f is 1.0, but this plane doesn't have rows and columns. So p hat lives in the millimeter world; it doesn't talk about rows and columns. The real one is the CCD sensor: it is f away from the center of projection, and it has rows and columns. So I will assume that I know the image pixel height and the image pixel width.

With this, let's write a formula to get the position of this point p, which corresponds to p hat. This is the normalized image plane, and this is the physical CCD sensor. So this is the formula. u hat and v hat are the coordinates of my p hat: x over z and y over z. To get p hat, remember, I take my point P, which is (x, y, z), and I put a one under it, so it is in homogeneous coordinates. What is the size of this vector?

What is the size of the vector that I framed with the blue line? Denis Jean, can you tell me the size of this vector? These questions are so simple that I think most people find them boring to answer, but I have to ask.

Denis Jean, are you there? No, I cannot read the chat; I don't see the chat. If you don't have a microphone you are not supposed to be in this lecture. People, if you don't have a microphone, just don't attend the class. So we are waiting for Denis Jean to write something, and somebody will read it.

Student: "He says the size of the (P, 1) vector is P squared plus one."

No, no. This is a very simple question. Baris, can you tell me the size of this vector here? P is a vector, (x, y, z). If I put a one just underneath it, what do I get?

Student: "Four-dimensional vectors in homogeneous coordinates."

P is a vector of size three, (x, y, z). If I write (P, 1), it means (x, y, z, 1): a four-dimensional vector that represents a three-dimensional point in homogeneous coordinates. So don't think in too complicated a way; I am just trying to make you follow the lecture. So these are four-dimensional vectors.

Student: "I thought that I had to answer the whole question."

Okay, I will come to that in a few minutes. Somebody was asking a question?

Student: "Why is the P squared plus one answer incorrect?"

Because, what is P squared? P is a three-dimensional vector, and P squared is P times P. The question is only about the length of the vector: if I put a one under x, y, z, how would you calculate the size of this vector? You just add one.

Student: "All right, I was mistaken. I understood."

So don't think this is a complicated question.

I am just adding a one; I am turning this inhomogeneous coordinate into a homogeneous coordinate. What I am expecting from you is to say that we are putting it in homogeneous coordinates, so it just goes from three components to four. That's it.

This part is an identity matrix and this part is a column of zeros, so together this is a matrix. Tell me the size of this matrix now. Since the vector is four-dimensional, the matrix has to have four columns, and it has three rows. So it is going to be ones on the diagonal, zeros elsewhere, and all zeros in the last column: rows 1 0 0 0, then 0 1 0 0, then 0 0 1 0. And this is (x, y, z, 1). So what did I do with this one? It is just perspective projection, that's it.
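The identity-plus-zero-column matrix and the conversion back from homogeneous coordinates can be sketched as follows. The test point is a made-up value; the variable names are illustrative.

```python
import numpy as np

# 3x4 matrix: 3x3 identity followed by a column of zeros.
I0 = np.hstack([np.eye(3), np.zeros((3, 1))])

# Real-world point (x, y, z) with a 1 appended: 4 components in homogeneous coordinates.
P = np.array([2.0, 4.0, 8.0, 1.0])

p_hat_h = I0 @ P                  # (x, y, z): a 2D point in homogeneous coordinates
p_hat = p_hat_h[:2] / p_hat_h[2]  # (x/z, y/z) on the normalized image plane

print(I0.shape, p_hat)
```

Here the division by the third component is what turns the three-component result into the (x over z, y over z) coordinates on the plane at distance 1.0.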

It is perspective projection like I did before. Remember the perspective projection matrix here: it is exactly the same as this one, with ones on the diagonal and zeros elsewhere. You will ask, where is d? Well, since this is the normalized image plane, remember we assumed d is 1.0, so d is gone.

Then there is the one over z: to go from homogeneous to non-homogeneous coordinates I have to divide by z, and I get p hat. p hat has two components, u hat and v hat; the coordinates of p hat are x over z and y over z. This one we already know.

Let's try to find how we go from p hat to p. p needs to be expressed in image coordinates, as a row number and a column number. p hat is in millimeters, again in real-world coordinates: if the origin is (0, 0, 0) in millimeters, then the center of the normalized plane is (0, 0, 1) and p hat is (x over z, y over z, 1).

We have switched the coordinate system again: z is growing this way and this is all positive, whereas before all the real-world points were in the negative space. I am trying to relate p hat with p, and for that I need three things: one is the focal length f, and the other two are the height and the width of each pixel. This is my pixel. So there are three numbers.

The book uses three parameters: k, l and f. k and l stand for the width and the height of the pixel; instead of w and h the book uses k and l. (For the units to work out, k and l have to act as pixels per millimeter, that is, one over the pixel width and one over the pixel height.) But the book says that k, l and f are not independent: they can be combined, so I will reduce them to two numbers, alpha and beta. k times f is alpha, and l times f is beta.

If I do that, this is the formula to get u and v from u hat and v hat: u is alpha times x over z plus u0, and v is beta times y over z plus v0. Can you tell me what u0 and v0 are?

Student: "Is it the focal length?"

The focal length is f, and we factored f into alpha and beta; remember, alpha is k times f and beta is l times f. What are u0 and v0? Look at the picture here; the picture is important.

Student: "Maybe unit vectors?"

If they were unit vectors they would be one and one. No, that doesn't make much sense. Remember what I said: we need to factor in the width and height of the pixel. That's fine, I did that. But then I introduced this u0 and v0. The thing is this: what is the position of C0? C0 is (0, 0, f). So this is the origin of this plane, (0, 0, f). But the origin of my image is not in the center; the origin of my image is either here or there. I start from one of the corners.

So the coordinate of C0 is important. I need to find the column number and the row number of this C0 in terms of pixels. Those numbers, u0 and v0, are the pixel coordinates of this point, which is called the image center. For my image, where is (0, 0)? It is here: this is my column number zero and row number zero. But in terms of millimeters, in the real-world coordinates, the origin is at C0, which is a different place. So I need to find the difference between the two: this much is going to be my u0, and this much is going to be my v0. That is the image center, u0 and v0.

So I have introduced two more parameters. For a given camera I need to know the image center, u0 and v0, and I also need to know alpha and beta. If I know those, then given the real-world point (x, y, z), I multiply x over z and y over z by alpha and beta, add u0 and v0, and I find u and v. This is just a formula; it is not yet written as a linear matrix multiplication, but I am going to show you how we do that in a few seconds.
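The two formulas just described, u equal to alpha times x over z plus u0 and v equal to beta times y over z plus v0, can be written as a small function. All numeric values below (alpha, beta, u0, v0 and the test point) are made up for illustration, and the function name `project_to_pixels` is hypothetical.

```python
def project_to_pixels(x, y, z, alpha, beta, u0, v0):
    """Project a camera-frame point (x, y, z) to pixel coordinates (u, v).

    Assumes the rows and columns of the sensor are perfectly orthogonal.
    """
    u_hat = x / z            # normalized image plane coordinates
    v_hat = y / z
    u = alpha * u_hat + u0   # scale by alpha, shift by the image center
    v = beta * v_hat + v0
    return u, v

u, v = project_to_pixels(0.5, -0.25, 2.0,
                         alpha=800.0, beta=820.0, u0=320.0, v0=240.0)
print(u, v)
```

A point on the optical axis (x = y = 0) lands exactly on (u0, v0), which is one way to read the name "image center".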

Before we move on, there is another parameter that we don't use all the time. What is the angle between u hat and v hat?

Student: "90 degrees."

Yes, they are orthogonal; it is 90 degrees. So what should the angle between u and v be? I would expect it to be 90 degrees too. But in the real world, since this is a physical device, they cannot make perfect sensors. The rows and columns are not perfectly orthogonal to each other: instead of a perfect 90 degrees, sometimes it is 89.9 or 90.1 or something like that. Such a difference is going to cause a lot of calculation errors, so I need to factor it into my calculations. I am not going to go into details, but if you put that angle theta into the calculations, this is what u and v should be: there is a cotangent of theta and a sine of theta in the formulas.

How many parameters do I have now? With the green marker: one, two, three, four, five. Five parameters. These two are the image center, these two are the scaled pixel width and height, and this one is the skew angle.

Student: "Don't we know v0 if we know u0?"

No, because the sensor doesn't have to be square. Some sensors are rectangular and some are square; we wouldn't know that. So u0 and v0 are not necessarily equal.

So these are the parameters. If you know the position (x, y, z) of your point and you would like to know where that point projects to in your image, then you run these two formulas. But to run them you need to know alpha, beta, theta, u0 and v0.
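The transcript mentions only that a cotangent and a sine of theta appear in the formulas; the slide is not reproduced. The sketch below uses the form commonly given in computer vision textbooks, u = alpha x/z - alpha cot(theta) y/z + u0 and v = (beta / sin(theta)) y/z + v0, which is an assumption about what the slide showed. The numeric values and the function name `project_with_skew` are made up.

```python
import math

def project_with_skew(x, y, z, alpha, beta, theta, u0, v0):
    """Project (x, y, z) to pixels (u, v) with skew angle theta in radians.

    Uses the common textbook form (an assumption, see the lead-in):
      u = alpha * x/z - alpha * cot(theta) * y/z + u0
      v = (beta / sin(theta)) * y/z + v0
    """
    u_hat = x / z
    v_hat = y / z
    cot_theta = math.cos(theta) / math.sin(theta)
    u = alpha * u_hat - alpha * cot_theta * v_hat + u0
    v = (beta / math.sin(theta)) * v_hat + v0
    return u, v

params = dict(alpha=800.0, beta=820.0, u0=320.0, v0=240.0)

# With a perfect 90 degree sensor the skew terms vanish.
u_90, v_90 = project_with_skew(0.5, -0.25, 2.0, theta=math.radians(90.0), **params)

# With 89.9 degrees the same point lands at a slightly different pixel position.
u_skew, v_skew = project_with_skew(0.5, -0.25, 2.0, theta=math.radians(89.9), **params)

print((u_90, v_90), (u_skew, v_skew))
```

Even a tenth of a degree of skew moves the projected point by a fraction of a pixel here, and the shift grows with the distance from the image center.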

Where do I get these numbers? Somebody has to calculate them. Maybe whoever produces that camera knows f, the pixel width and height, and theta; maybe they will give me those numbers and I will use them in my calculations. But usually we don't do it that way. For each camera we estimate these parameters ourselves, and we call this operation camera calibration. Camera calibration is the estimation of these numbers from images, and we are going to talk about it in one of the lectures of our course. Any questions so far?

Student: "I have one. Why don't they make it 90 degrees?"

Well, sometimes you cannot. Sometimes you produce a product and you can measure the error, but you cannot make it perfect.

Student: "So they saw the wrong angle after they produced it?"

After they produce it they measure it, and they realize that it is not perfectly 90 degrees but 90.1, 90.01 and so on. It is like the F-16 fighter planes: after they produce each plane they measure it, and they say, for example, that it has two percent error, so it is tolerable; anything above 95 percent we can use. Sometimes they produce a perfect airplane with no error at all, and it becomes an actual event: in the news you would hear that this month they got one perfect airplane. Producing something is different from measuring the errors. It is easy to measure the error after you produce it; producing the perfect thing is another matter.

The other thing is this: even the measured angle theta changes. When the CCD heats up during operation it deforms, and this angle changes. If the environment is too cold or too hot, depending on where you are and depending on your casing, it changes. So it is better to estimate all of these parameters yourself before you use your camera. For some applications on the factory floor, they stop all operations every few hours, every day, and recalibrate their cameras.

Any other questions?

Student: "I have a question. Can we say that we have an image and we move it into a vector space, and the image we see on our computer or display depends on where we put the plane in that space? Is that right?"

Yes, that's true; I am going to talk about it in a few minutes. You are saying that we are ignoring where this point is with respect to the camera. This point is in the real world, and we assume that the camera center is at (0, 0, 0). But when I change my camera position, the relative distance between this P and O will change. What about that? This model assumes that the camera is at the center of the real world, and that's not true. I will talk about it in a few minutes. That's a good point.

Let's take a ten-minute break. When we come back, first we are going to turn this calculation into a linear matrix multiplication, using what we know so far about homogeneous coordinates, and then we are going to talk about what was just asked. So let's be here around 14:38.

You uh okay so let's um let's continue from this point we said that we have five parameters now we need to estimate them but before going on further let's try to express these using a linear matrix operation like .

That one okay and what we're gonna do is we're to place all these alpha beta theta u0 v0 numbers inside this matrices and uh we'll build that way okay so that's that's it this matrix okay is called intrinsic parameters matrix .

Intrinsic (in Turkish, içsel) camera parameter matrix, and it's a three by three matrix. Before that we said that it is three by four, right? But that was working on the real-world coordinates. This one works on p hat. So whatever p hat we have, multiply it with this K matrix and we are going to find this point p. What we have here is something that transforms our coordinates from this plane to that plane. It is already working on homogeneous coordinates, of course, but it is working with two-dimensional homogeneous coordinates. That's why they are three-dimensional vectors.

Okay, let's look at this one. For a moment let's ignore this theta. Let's say theta is perfect, it is 90 degrees. In that case, what is the cotangent of 90? Is it one or zero? Okay, zero. So the first row of K is alpha, zero, u0; the second row is zero, beta, v0 (sorry for my v); and the third row is zero, zero, one. So how is this different from the identity matrix? It is like the identity matrix, but we put u0 and v0 here, and instead of ones I have alpha and beta.

But then we are only interested in transforming a real-world X, Y, Z to the image coordinate system. How are we going to do that? I will define a matrix M. As previously, M is a three by four matrix: the first part, three by three, is K, and the last column is zero. So if I ignore this theta again, M by definition would be: first row alpha, zero, u0, zero; second row zero, beta, v0, zero; third row zero, zero, one, zero. If I multiply this with any X, Y, Z, one, this is M, this is P, this would be M times P, and this is going to give me the corresponding homogeneous coordinate on my image. I divide by Z to get the corresponding two-dimensional inhomogeneous point.
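As a rough illustration of the K and M matrices just described, here is a minimal NumPy sketch. The numeric values (alpha, beta, u0, v0, and the test point) are made up for the example, and the way theta enters K (a minus alpha cot theta term and a beta over sin theta term) is an assumed convention, chosen because it reduces to the simplified K above when theta is 90 degrees.

```python
import numpy as np

def intrinsic_matrix(alpha, beta, u0, v0, theta=np.pi / 2):
    """Build a 3x3 intrinsic matrix K (skew convention is an assumption)."""
    cot = np.cos(theta) / np.sin(theta)  # ~0 when theta is 90 degrees
    return np.array([
        [alpha, -alpha * cot, u0],
        [0.0, beta / np.sin(theta), v0],
        [0.0, 0.0, 1.0],
    ])

# Hypothetical parameter values, only for illustration.
K = intrinsic_matrix(alpha=800.0, beta=800.0, u0=320.0, v0=240.0)

# M = [K | 0]: a 3x4 matrix acting on points given in the camera frame.
M = np.hstack([K, np.zeros((3, 1))])

# A point (X, Y, Z, 1) in homogeneous camera coordinates.
P = np.array([0.5, -0.25, 2.0, 1.0])

p_h = M @ P              # homogeneous image coordinates
p = p_h[:2] / p_h[2]     # divide by the third component (Z here)
print(p)
```

The third component of `M @ P` equals Z in this setup, which is why dividing by it matches the "divide by Z" step in the lecture.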

Okay, so now I can use all these camera parameters to calculate my image coordinates. But still, as Dual said before, I assume that this P lives in the coordinate system of this camera. The camera thinks that my center of projection is zero, zero, zero. Suppose this is my camera right here, looking at the real world, and zero, zero, zero is the center of projection of this camera. I took the images, but if I change the position of this camera, I change my zero, zero, zero, while the real-world objects don't change, right? So how am I going to take this into consideration? I need to find a way to relate the camera's zero, zero, zero to the real-world coordinates. Let's say the real-world coordinates start at some position zero, zero, zero, and I would like to express the difference between the camera's zero, zero, zero and the real world's zero, zero, zero, so that I can relate my camera position to the real-world position. That's what we're going to do right now.

Don't forget this one: M is K and zero. This is very similar to what we did before with the identity and zero. Instead of the identity I put alpha here, beta here, and u0 and v0 here, if I ignore the theta. If you don't ignore the theta, this one and this one change too. So let's talk about the difference between the camera coordinate system and the real-world coordinate system.

Okay, so I am going to define a transformation between them. Let me draw it here. This is my camera, it is looking at the real world, and this is my coordinate system inside my camera: x, y, and z. In my real world there is a point P, and the x, y, z of the real world is like that. So this is the zero, zero, zero of the real-world coordinates, and this is the zero, zero, zero of the camera itself. I would like to know the relationship between this point and that point, the difference between them. (Can you turn off your microphone? I am getting some hissing noise.) I would like to relate these two: this is my real-world coordinate system and this is the camera coordinate system. If I know the relationship between these two, then I can express this point, which is given in the real-world system, in terms of the camera, and then I would continue. That's what I'm going to do.

Before doing that, I will give you some geometric information. When we talk about x, y, z, we use the right-hand rule. If this is your i, j, k, or x, y, z, use your right thumb to show the direction of your z, or k. If you do this movement, wrapping your fingers around the z axis, you should first meet x, then y, because you have a right-handed coordinate system. Most of the time we are going to use the right-handed coordinate system; that's the regular coordinate system. If you like to use the left-handed coordinate system, which sometimes happens, it is just the opposite. So this is a right-handed coordinate system: x, y, and z.

With the right-handed coordinate system, let's say A is my camera coordinate system and B is my real-world coordinate system. I have i, j, k again. If a point P is expressed in terms of the A coordinate system, how do I express the same point in terms of the B coordinate system? Suppose there is only pure translation between these two coordinate systems. See, k_A and k_B are in the same direction, j_A and j_B are in the same direction, i_A and i_B are in the same direction. So the directions are exactly the same.

There is no rotation between these two coordinate systems, only translation. So this formula says that if P is expressed in A, just add this vector, which is the difference between these two coordinate systems, to P in A, and you will have your point expressed in B. That's it. So how many numbers do I need to know to make this transformation? Three, just three, right? What are the three? Just the x, y, z components of this vector. Three numbers are enough to make this translation transformation. So that's one thing.

The other thing is there might be a rotation between these two; so far I assumed that there is no rotation. Suppose there is a rotation between these two coordinate systems. Remember, red is A, green is B. They don't have any translation difference, but they have a rotation difference between them.

In that case you need to multiply the point P, expressed in terms of the A coordinate system, with a rotation matrix. This rotation matrix will transform the point from the A coordinate system to the B coordinate system. It is a three by three matrix, a special rotation matrix, and this three by three rotation matrix is by definition an orthogonal matrix. That means that R transpose times R is the identity. What does that mean? If R transpose times R is the identity, then R transpose is equal to what? Just the inverse of R, right, R to the minus one. These kinds of matrices are called orthogonal matrices.

So, a question. You told me that three numbers are enough to express this pure translation transformation, right? How many numbers do I need to express this pure rotation case? It looks like I need a three by three matrix, so it looks like nine numbers, but it is much lower than nine. No, it is lower than six. Where did you get six?

Well, all I need to know is the angle difference between the axes. What is the difference between k_A and k_B? Here it is, this angle. That's one. That's the difference between the k components of these two coordinate systems. How about the difference between the j components? The j components are like that, so this angle. And then how about the difference between the i components, i_B and i_A? That would be the third one. So three numbers are enough. Then why do I need these nine numbers? If I put it in the nine-number form, this operation becomes a linear operation: I can do it with a simple matrix multiplication. So this is good, but this is not good: this is addition, and addition is not good, because adding a fixed vector is not a linear operation. I need multiplication.

One of the advantages of using the homogeneous coordinate system is that this addition can be expressed as a multiplication, and I'm going to show you in a few minutes. By the way, just to express it more simply, take a two-dimensional world. Let's say I have this on a simple plane; I have only i and j components, x and y, that's it. If I would like to transform a point from this red coordinate system to the green coordinate system, the only difference is this angle, right? If I'm looking from the top, there is only one parameter: the angle theta. So how do I use this? This is how you convert that information to this two by two matrix. In a two-dimensional world this transformation is just a two by two matrix, but there is only one number in this matrix, just theta; all the rest are derived from this theta. So this is just one example. See, this one is an orthogonal matrix: if I multiply it with its transpose, on the diagonal I get cosine squared plus sine squared, which is one, and off the diagonal I get cosine theta sine theta minus sine theta cosine theta, which is zero. So it is orthogonal.

Okay, good. So this is the general case: I have two coordinate systems, and there is a rotation difference and a translation difference. What do I do? I take this point P, I rotate it, and I add this vector.
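The orthogonality argument for the two by two rotation can be checked numerically. This is only a small sketch: the sign placement of the sine terms is an assumed convention (the slide may use the transpose), and the 30 degree angle is an arbitrary example value.

```python
import numpy as np

def rotation_2d(theta):
    """2x2 rotation matrix built from the single parameter theta."""
    c, s = np.cos(theta), np.sin(theta)
    # Assumed sign convention; the transposed form is equally orthogonal.
    return np.array([[c, -s],
                     [s, c]])

R = rotation_2d(np.deg2rad(30.0))

# R^T R should be the identity: ones on the diagonal, zeros elsewhere.
RtR = R.T @ R
print(np.round(RtR, 12))

# For an orthogonal matrix the inverse is just the transpose.
R_inv = np.linalg.inv(R)
```

All four entries of `R` come from the single number theta, which is the point made above about the matrix having only one free parameter.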

Rotation first, then addition of the vector. But again, this is not linear. I need a nice linear operation to transform this point from one world to another, and of course I have the homogeneous coordinate system. This is what I am going to do: I will put a one under this P, so it is going to be a four-dimensional homogeneous coordinate, and I will multiply it with this transformation matrix T, and I will get another four-dimensional homogeneous coordinate. So what is the size of this T? Can somebody tell me? It is four by four, right? Why? Because this one has four elements and this one has four elements, so it has to be four by four.

And this is the definition of that matrix. How could this be four by four? Because this block is three by three, this is all zeros, zero, zero, zero, this is just three elements, and this is one. So a four by four matrix multiplied by this four-dimensional homogeneous point gives me another homogeneous coordinate in the four-dimensional world. Instead of doing a multiplication and then an addition, I just expressed this thing as a single matrix multiplication, which is perfect. That's what I want.

So how does this happen? It happens like that: rotation, translation, zero, and one. The rotation is multiplied by P, which gives R times P, and the translation is multiplied by the one, which gives t times one. If you add them together you get P prime. That's because I am adding this one here.
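To make the block structure of T concrete, here is a minimal NumPy sketch that packs a rotation R and a translation t into one four by four matrix and checks that a single multiplication gives the same result as "rotate, then add". The particular R (a 90 degree turn about z), t, and point are made-up example values, and `rigid_transform` is a hypothetical helper name.

```python
import numpy as np

def rigid_transform(R, t):
    """Pack a 3x3 rotation R and a 3-vector t into a 4x4 matrix [[R, t], [0, 1]]."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Example values: rotate 90 degrees about the z axis, then shift by t.
R = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
T = rigid_transform(R, t)

P_a = np.array([1.0, 0.0, 0.0])

# Two-step version: rotation first, then addition of the vector.
P_b_two_step = R @ P_a + t

# One-step version: append a 1 and do a single matrix multiplication.
P_b_h = T @ np.append(P_a, 1.0)
print(P_b_h)
```

The last component of the result stays 1 because the bottom row of T is zero, zero, zero, one.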

Okay, so instead of using an addition, I express everything in homogeneous coordinates and I don't have any addition anymore. Everything is expressed in terms of multiplications; everything is linear. Actually this is maybe the most important reason why we switch to the homogeneous coordinate system: with homogeneous coordinates, all these things can be expressed in terms of matrix multiplication. Okay, good. Any questions so far?

So all we did is this. Let's say this is our real-world coordinate system. I know the position of this P in the real-world coordinate system and I would like to express it in the camera coordinate system. This is what I do: I take this P and I multiply it with this four by four matrix, R, t, zero, one, and I get P in the other system. So let's say this one is the real-world coordinate system and this one is the camera coordinate system.

Still, my point is in the three-dimensional world; I did not express it in the image coordinate system. So how do I express it in the image coordinate system? I take P_C, the point in camera coordinates, and multiply it with this M. Remember M? M is our intrinsic camera matrix with the zero column. If I do that, it gives me my small p in the image coordinate system. So I am going to combine this matrix with this M and get one large matrix, and that's going to be my projection matrix. This is the formula. R is the rotation matrix and t is the translation vector, so these two are for relating the real-world coordinate system to the camera coordinate system.

Okay, so there are three numbers in the translation, and although the rotation is a three by three matrix, there are three numbers in it. So six numbers come from the rotation and translation, and this is my intrinsic camera parameters matrix. If I multiply these two, I get my projection matrix. Any point in the real-world coordinate system multiplied by M will be my image coordinate, and if I divide it by its third component, z, I get the non-homogeneous point.

So this is the matrix. K is a three by three matrix, R is a three by three matrix, and t is just a three-dimensional vector. If I express everything in terms of matrix calculations, this is the matrix M, which looks complicated, but let's look at it. This t_z is the third component of the translation vector. Is this a vector or a scalar? It's just a scalar, right, because it is the third component of the translation vector. How about this one? This is the third row of the rotation matrix, so it is three numbers: one number, two numbers, three numbers. This is just a single number, a scalar again. Another scalar, and this is a row vector.

So M is a three by four matrix, three rows and four columns, which we call a projection matrix. Given a point P in the real world in homogeneous coordinates, if I multiply P with M, I get the corresponding point on my image plane in homogeneous coordinates. I have been working on this for the last two weeks.
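The composition described here, M equal to K times the three by four block of R and t, can be sketched in a few lines of NumPy. All numbers below (the intrinsics, the 10 degree rotation about y, the translation, and the world point) are hypothetical example values, theta is taken as 90 degrees, and `projection_matrix` is a name introduced only for this illustration.

```python
import numpy as np

def projection_matrix(K, R, t):
    """3x4 projection matrix M = K [R | t]."""
    return K @ np.hstack([R, t.reshape(3, 1)])

# Example intrinsics with theta = 90 degrees (no skew).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Example extrinsics: a 10 degree rotation about the y axis plus a translation.
ang = np.deg2rad(10.0)
R = np.array([[np.cos(ang), 0.0, np.sin(ang)],
              [0.0, 1.0, 0.0],
              [-np.sin(ang), 0.0, np.cos(ang)]])
t = np.array([0.2, -0.1, 5.0])

M = projection_matrix(K, R, t)

# A world point in homogeneous coordinates.
P_w = np.array([0.5, 0.3, 2.0, 1.0])

# Route 1: one multiplication with M, then divide by the third component.
p_h = M @ P_w
p = p_h[:2] / p_h[2]

# Route 2: world -> camera frame (R P + t), then apply K and divide.
P_c = R @ P_w[:3] + t
q_h = K @ P_c
q = q_h[:2] / q_h[2]
```

Because the last row of K is zero, zero, one, the last row of M is the third row of R followed by t_z, which is the row-vector and scalar pair pointed out in the lecture.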

Finally I am showing you what I get. This one includes lots of data, actually. How many parameters are there? Let's count. The elements of the translation vector: one, two, and three. For my rotation matrix, although there are three rows and each one includes three numbers, I count it as three numbers. So for the translation there are three, for the rotation there are three, and then I have my intrinsic parameters, which are alpha, beta, theta, u0, and v0, so five numbers for the intrinsics. A total of 11 numbers determines this projection matrix M. How many numbers are there in M? There are 12 numbers, right, because it is a 3 by 4 matrix. But to make this M I need these 11 numbers. If you know these 11 numbers, then you have done your camera calibration.

The rotation and translation components of the camera parameters are called extrinsic camera parameters. Extrinsic camera parameters change whenever you change your camera position. Where is my camera picture? Okay, here it is. If I move my camera from this position to that position, I have to come up with another R and t. So this is R2, t2 and this is R1, t1. But when I change my camera position, my camera's intrinsic parameters don't change; only the extrinsic parameters change. So out of those eleven numbers, six of them change and five of them stay the same. In fact, when you say "I have calibrated my camera", it usually means that you found those five numbers: alpha, beta, theta, u0, and v0.

Okay, you have been so quiet. Can I have some questions? Of course I am not expecting you to memorize all this stuff, but you should know this: M is K times R, t. K is three by three, R is three by three, t is three-dimensional, and in total M is three by four. K is our intrinsic parameters; R and t are extrinsic. Any questions? If nobody has any questions, I will force you to ask me a question.

Okay, so we said that the rotation matrix is an orthogonal matrix. If we take an orthogonal matrix, let's say with rank 3, can we reduce it such that it has only three numbers that are important?

Yeah, but it's not a general matrix. I mean, R is not a general matrix, it is only a rotation matrix, and we are going to take advantage of this a lot.

That property comes from the matrix being orthogonal, right?

Yeah.

Okay, R by definition has to be orthogonal. And what is camera calibration? Calibration means that we are going to recover all those numbers from the images. When we recover R, we will assume that it is not a general three by three matrix: it has nine entries but only three degrees of freedom, so we should find a matrix that is a proper rotation. If you find a matrix that is not orthogonal, then you are doing something bad, so there should be a way of enforcing this constraint. By the way,

If M1 is your projection matrix, then let's say k is a nonzero scalar. Since this is in homogeneous coordinates, k times p is the same point, right? So I can multiply M with k and it is still the same projection matrix. That's why the number of degrees of freedom of M is not 12, even though there are 12 numbers in it; it is 11. Okay, good.
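The scale argument can be verified directly: multiplying every entry of M by the same nonzero number changes the homogeneous result but not the image point after division. The matrix and point below are arbitrary made-up values, used only to illustrate the idea.

```python
import numpy as np

def project(M, P_h):
    """Apply a 3x4 projection matrix and divide by the third component."""
    p_h = M @ P_h
    return p_h[:2] / p_h[2]

# An arbitrary example projection matrix (hypothetical values).
M1 = np.array([[800.0, 0.0, 320.0, 100.0],
               [0.0, 800.0, 240.0, 50.0],
               [0.0, 0.0, 1.0, 5.0]])

P = np.array([1.0, 2.0, 10.0, 1.0])

k = 5.0                      # any nonzero scalar
p_original = project(M1, P)
p_scaled = project(k * M1, P)

# The homogeneous vectors differ by the factor k ...
h_original = M1 @ P
h_scaled = (k * M1) @ P
# ... but the image points after division are identical.
print(p_original, p_scaled)
```

One of the 12 entries of M is therefore "used up" by this free overall scale, leaving 11 degrees of freedom.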

Let me open this up. Okay, this is good; this is from some other source, and it's a good source. What we are saying is that I have a real-world point which is expressed in the real-world coordinate system, X, Y, Z, one in homogeneous coordinates. I multiply it with a three by four matrix and I get this s x, s y, s in homogeneous coordinates, and if I divide everything by s, I get my point in inhomogeneous coordinates. I can decompose this three by four matrix into the intrinsic parameters and the extrinsic parameters, but some people like to express it this way. They write the intrinsic parameters without the skew angle: f s_x and f s_y. Remember alpha? The definition of alpha was this, right? s_x is the width of the sensor pixel, s_y is the height, and if I multiply them with f, I get alpha and beta. These are the center point, u0, v0. This is for the projection; remember how we did the projection, this is the projection matrix. This is the four by four matrix for the rotation and this is the four by four matrix for the translation: identity, zero, one, with the translation; and rotation, zero, zero, one. Translation, rotation, projection, and intrinsics: this makes the whole matrix Pi, or M.

Okay, the definitions of these parameters are not completely standardized. It changes: some people like to include this theta, some people don't. It may look a little bit scary, but it actually is not. Good. So let me open up the OpenCV documentation regarding camera calibration.

So this is how you do the camera calibration. You print out a checkerboard image like that, and you show this image to your camera in many different positions and angles. Once you do that, you get lots of data, and you detect these corner points using the corner detection algorithms of OpenCV. You get them from all the images, and then you get your camera matrix using, let me show you, where is it, yeah, the calibrateCamera function. You give your points, the image points, and it returns these: this is your translation vector, this is your rotation vector, I will talk about these distortions in a few minutes, and this is your intrinsic camera matrix. I don't know what this ret return value is; I think it is some kind of a status. Let's go to the function itself and get the documentation. Okay, calibrateCamera, here it is. The camera matrix is the intrinsic matrix, then the rotation vector and the translation vector, and the distortion. Where is the returned value? Flags... did you see the return value?

Yeah, I'd like to know what it returns. Did you see it? Yeah, after the parameters: it returns the RMS error. Oh, okay, so it is running some kind of an optimization. If it comes back as zero, it is perfectly optimal; if it is not, then you are going to get some value. That's the return value.

So this is the camera matrix, and inside the camera matrix this is what you have: f_x, f_y, c_x, and c_y. These f_x and f_y are not actually the focal length in millimeters. I mean, there is only one focal length, right? There is no x focal length and y focal length. f_x and f_y are the focal length expressed in units of the pixel width and the pixel height. c_x and c_y are the center position. Okay, this is how you do the camera calibration. It's very convenient. It used to be very, very difficult 25 years ago; 20 years ago it became easy with this approach. This way of calibrating a camera is a little bit difficult to explain, but it is very easy to use.

There is another method which is easy to explain but difficult to use. Since I like the easy stuff for me, I am going to show you how to do the camera calibration with the method that is easy to explain but difficult to use, but I am not going to ask you to do the camera calibration with that method. Let me go back to my slides and continue from that point. But it looks like I am out of time again. Do you have any questions?

A question for you. I think I asked this question before, but I love asking the same question over and over again. Let's say I have a camera, this blue camera. It is looking at the real world, and there is a point A and there is a point B. This camera is producing these image points, small a and small b. Let's say I know the projection matrix M of this camera. So what do I do? I say small a is equal to M times A, and small b is equal to M times B, right? I can do that; I know my projection matrix. This side is in the three-dimensional world and this side is in the two-dimensional world. If I take the inverse of the projection matrix, M inverse times small a would give me capital A, right? So by looking at the two-dimensional point I can get back to the three-dimensional point that produced it. Can I do this? Using this linear algebra stuff, I could determine the depth of points from their image representations.

Is this a nice property of this projection matrix? Tell me what you think.

Probably it won't work, right? Because we lost one dimension when we projected the real scene.

Yeah, what you are saying is very true, but how could you explain this then? I am using linear operations; I can write the inverse of a matrix and I can do this multiplication, right? My equations are telling me that I could do this. I love the equations, but what Razor is saying is true. This is a projection, and by projection we lost one of the dimensions. How can you get the point back from lost information? Suppose A and B are like this: their image is at just the same point. It doesn't matter if it's A or B, they are producing exactly the same point, right? So how would this equation know the difference between A and B? What is this equation then? Say M is invertible at least to some degree, with the pseudo-inverse or something. I can write this, but I know that I am not going to get A, because if I got A then this one would be B, and I know that A and B are the same point on my image. So what does this equation give me? What is M inverse times a, or M inverse times b? I think it should be very clear from the picture. Look at the picture. What is this formula?

It is the formula for this line. It doesn't give me a single point, it gives me a line, and that line contains infinitely many points that can produce this image point. So it's not going to give me the exact point with its depth.
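The claim that "M inverse times a" is a line rather than a point can be illustrated with the pseudo-inverse. In this sketch (all numbers are made-up example values) two world points A and B lie on the same viewing ray and project to the same pixel; the pseudo-inverse gives one homogeneous point that projects back to that pixel, and adding any multiple of the null vector of M (the camera centre) gives another such point. Using `np.linalg.pinv` here is an assumption about how one might write the "inverse" of a three by four matrix, not something taken from the slides.

```python
import numpy as np

def project(M, X_h):
    """Apply a 3x4 projection matrix and divide by the third component."""
    x = M @ X_h
    return x[:2] / x[2]

# Example camera (hypothetical values): M = K [R | t].
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.1, -0.2, 0.5])
M = K @ np.hstack([R, t.reshape(3, 1)])

# Two world points on the same ray through the camera centre C = -R^T t.
C = -R.T @ t
d = np.array([0.1, 0.2, 1.0])          # ray direction
A = np.append(C + 2.0 * d, 1.0)
B = np.append(C + 5.0 * d, 1.0)
a = project(M, A)
b = project(M, B)                        # same pixel as a

# One solution of M X = a from the pseudo-inverse ...
X0 = np.linalg.pinv(M) @ np.append(a, 1.0)
# ... and the null vector of M, which is the homogeneous camera centre.
C_h = np.linalg.svd(M)[2][-1]

def ray_point(lam):
    """Every X0 + lam * C_h projects to the same image point a."""
    return X0 + lam * C_h
```

So the equations are not wrong; they simply describe the whole ray, and depth along that ray stays undetermined.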

But it's going to give me a line along which infinitely many points live. Okay, that's nice. Good.

Yes, it's a little bit clearer for me.

Okay, good. Let's take ten minutes of break. After the break I will continue a little bit with this. Actually, let me do this: let me add another camera. Let's say this is M1, this is going to be M2, and it is looking at this A and B, right? We are going to look at this when we come back. So let's be here around 15:38.

All right, people, let's look at this image again. If this is my image plane in this camera M2, the images of A and B will be at different positions; they will not be at the same position, right? So let me update these equations. This is M1 inverse and M1 inverse: this is what I expect, these will produce a line, the same line. Let's call these a1 and b1, because this is camera one, and here it is going to be a2 and b2. If I do M2 inverse times a2 and M2 inverse times b2, these will not be the same, right? So that means this is another line, and this is another line: this one corresponds to the green line here and this one corresponds to the red line there.

So if I intersect these lines, I am going to find the positions of A and B. If you have two images of the same real-world scene from two different angles, then it is possible to recover the depth of a point, if you know where the points are on your images. Usually we do it this way: M1 times the point A is equal to a1, and M2 times the point A is equal to a2, right? What do I know in these two equations? I know these twelve numbers, I know these twelve numbers, I know these three numbers, and I know these three numbers. What I don't know is the three numbers in this A.
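The two equations, M1 times A equals a1 and M2 times A equals a2, can be stacked into one linear system in the unknown A. The sketch below uses a DLT-style formulation solved with an SVD; that particular formulation is an assumption chosen for illustration, not necessarily the one used later in the course, and the cameras and the point are made-up example values.

```python
import numpy as np

def project(M, X_h):
    """Apply a 3x4 projection matrix and divide by the third component."""
    x = M @ X_h
    return x[:2] / x[2]

def triangulate(M1, a1, M2, a2):
    """Recover a 3D point from its images in two calibrated views (DLT-style)."""
    rows = []
    for M, (u, v) in ((M1, a1), (M2, a2)):
        # Each view gives two linear equations in the homogeneous point X.
        rows.append(u * M[2] - M[0])
        rows.append(v * M[2] - M[1])
    # The solution is the null vector of the stacked 4x4 system.
    _, _, Vt = np.linalg.svd(np.array(rows))
    X = Vt[-1]
    return X[:3] / X[3]

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Camera 1 at the world origin; camera 2 rotated 10 degrees about y and shifted.
M1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
ang = np.deg2rad(10.0)
R2 = np.array([[np.cos(ang), 0.0, np.sin(ang)],
               [0.0, 1.0, 0.0],
               [-np.sin(ang), 0.0, np.cos(ang)]])
t2 = np.array([-1.0, 0.0, 0.0])
M2 = K @ np.hstack([R2, t2.reshape(3, 1)])

A_true = np.array([0.3, -0.2, 4.0])
a1 = project(M1, np.append(A_true, 1.0))
a2 = project(M2, np.append(A_true, 1.0))

A_est = triangulate(M1, a1, M2, a2)
print(A_est)
```

With noise-free image points the estimate matches the true point up to floating-point error; with real, noisy detections the same system is solved in a least-squares sense.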

Then I can make this one large linear system and solve for this A. This is not exactly making a stereo system, but it is close. It is multi-view geometry, multi-view camera systems, and we will do lots of stuff like that later in the semester. It is not possible to recover the depth from a single image, but if you have more than one image of the same scene, then yes, it is possible. Okay, let me continue. OpenCV has lots of nice functions. The function for camera

calibration I showed you. Another nice function of OpenCV is projectPoints, which is very good. Let's say you have three-dimensional points, and you give your rotation vector, translation vector, camera matrix, and distortion coefficients (we are going to talk about those). These will be your inputs, all of them. If you give them to the projectPoints function, it will take these points, which are like your points P; these (get rid of this one) are your M; and this will be your output, small p. It will calculate and give you the results. Another good OpenCV function is findHomography. You say: I have two images; on image number one, my source points; on image number two, my destination points; find me the homography, that is, the three by three matrix that connects these two images, assuming that these are all on the same plane. So please study the OpenCV documentation. Try to see what kind of functions are available for finding the homography, calibrating your cameras, doing the projections, et cetera. Okay, I kept saying that I will talk about these distortions. Let's talk about the

distortions. We said that under a projectivity, lines in one world stay as lines. If this is the real world and I have lines like those, and this is my camera looking at the real world like that, then on my image those two parallel lines could intersect, they could behave like that, but they will stay as lines. That's what the projection matrix does: this projection operation doesn't change lines into something else. That's what we assume. Unfortunately, the lenses that we use don't produce images like this. With the lenses that we use, in the center maybe the lines stay as lines, but when you get away from the center it changes. For example, here, in green, this is a very nice line, and this is a nice line, but when you get away from the center things become either like that, pincushion distortion, or like that, barrel distortion.

I mean, it should be like this dashed line, but it becomes like this pincushion line. This is called lens distortion. We cannot make perfect lenses; lenses behave like that, especially zoom lenses and cheap lenses.

They change the picture a lot, and that is going to mess up all of my calculations of M times P, because I assume that everything is linear: lines stay as lines. With a distorted image that does not hold. (Sorry, I said circles stay as circles; they do not. It is lines that stay as lines. Linearity is what is preserved.)

Distortion is caused by imperfect lenses, and the deviations become more pronounced away from the center. At the corners of the images they are even worse. So what are we going to do? We are going to find the distortion amounts and undistort our images. If my image is like that, or like that, we will undistort it into this shape.

Then we will run everything as before. Notice that the distortion is not there at the center. There is no problem at the center; when you get away from the center, the distortion becomes apparent.

So I am going to measure how far I am from the center. This d is important: if d is 0 there is no distortion, but if d is large there is lots of distortion. The distortion amount here is this much, and the distortion amount here is this much: the point has to be here, but it is here. At the center there is no distortion.

I need to find something like this, so I am going to model it with radial distortion. Radially, there is no distortion around here, and when the radius gets larger the distortion becomes larger. So I am going to measure how far I am from the center; if I am too far away, the distortion is going to be large. And they came up with this formula.

This is a good example. This is a distorted image. Is this pincushion distortion or barrel distortion? It is barrel distortion. Barrel is like a barrel, you know barrels: this is a barrel, an object like that, which you put your liquids in.

So this is barrel distortion, and I would like to get rid of it and get the linear image. See, here all the lines are lines: this is a line, this is a line, this is a line. In the other case, no, these are not lines; it is barrel distortion.

Which one do you like better? In terms of using our equations the undistorted one is better, but some people prefer the distorted one. Can you tell me why? You say this one is better because of the focus in the center. Well, I produced this image from that one, so there should not be any focus difference between the two. I am not changing any focus: if it is sharp there, it is going to be sharp here; if it is blurred there, it is going to be blurred here.

You say you could crop the image, that this one takes more space. Then just make the upper one smaller. Oh, the center of the image takes more space, so you can focus on it? Yeah, kind of, maybe. But usually the idea is this. If I look at this part of the image, this is a 90-degree corner, right? Here it looks more natural than there. So the one that is distorted radially looks more natural locally. Locally it is more acceptable, but only locally.

Circles look more like circles. If you look at it from a distance, that is not good, for example this one. But the other one looks weird at some points, especially toward the edges of the image, like these areas. So some people might prefer these kinds of images.

Okay, let's go back and see how we can get this image out of the distorted image. It is called distortion elimination, but modeling the distortion comes first. This is our projection, x hat over z hat. We know this; it is a very simple projection.

Then we apply radial distortion. Radial distortion is this: find the distance r between this point and the center. That r squared is going to be important. I multiply r squared by kappa 1 and the fourth power of r by kappa 2.

Add one, multiply by x, and that gives me my new position after distortion. After that, if I apply f to it, I get the image coordinates of the same point. So modeling the distortion means finding kappa 1 and kappa 2. If I would like to continue, I can add another coefficient times r to the power six.

Once you find all those numbers, there is an undistort function in OpenCV, and with it you get this image.
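The radial model described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not OpenCV's code: the coefficient values are invented, `radial_distort` and `radial_undistort` are hypothetical helper names, and the inverse is computed with a simple fixed-point iteration, which is one common way to undo mild distortion but is not necessarily what any particular library does.

```python
def radial_distort(x, y, k1, k2, k3=0.0):
    """Apply x' = x * (1 + k1*r^2 + k2*r^4 + k3*r^6) to normalized coordinates."""
    r2 = x * x + y * y                      # squared distance from the center
    factor = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
    return (x * factor, y * factor)

def radial_undistort(xd, yd, k1, k2, k3=0.0, iterations=30):
    """Recover the undistorted point by fixed-point iteration (mild distortion only)."""
    x, y = xd, yd                           # start from the distorted position
    for _ in range(iterations):
        r2 = x * x + y * y
        factor = 1.0 + k1 * r2 + k2 * r2 ** 2 + k3 * r2 ** 3
        x, y = xd / factor, yd / factor     # refine the estimate
    return (x, y)

# Assumed coefficients; a negative k1 pulls points toward the center (barrel-like).
K1, K2 = -0.2, 0.0

center = radial_distort(0.0, 0.0, K1, K2)   # the center does not move
near = radial_distort(0.1, 0.1, K1, K2)     # small radius, small shift
far = radial_distort(0.5, 0.5, K1, K2)      # large radius, large shift
recovered = radial_undistort(far[0], far[1], K1, K2)
print(center, near, far, recovered)
```

The point at the center stays put, the point near the center barely moves, and the far point moves the most, which matches the observation that distortion grows with the distance from the center.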

You would calibrate your camera using this image. So these are two more parameters, intrinsic parameters, for your camera. The distortion coefficients argument here means that I give you the distortion coefficients and you use them in the projection operation.

If you go to the camera calibration routines again, remember calibrateCamera? Yeah, here it is. This one gives you the camera matrix, which holds the intrinsic parameters, the distortion coefficients, the rotation vectors, the translation vectors, and so on. So OpenCV has a nice way of getting rid of these distortions.

Usually we don't prefer this kind of thing. We try to use good lenses, lenses that don't introduce any distortion.

Good. One other type of projection, at the end of this chapter. Well, there are infinitely many types of cameras that you can come up with; let's look at some of them. Somebody had this idea: this is your normal lens, and the lens is looking upwards.

This is a lens, and somebody has mounted a mirror on top of it. This is a hyperbolic mirror. See this here (I am trying to fix my pen; okay, reboot the pen): this is a mirror with a hyperbolic surface. So when this lens looks upwards at a picture like this one, it will see it like that.

So this one sees all around the camera. It captures whatever it has around it, so it sees 360 degrees. In the center it is broken. Why? Because in the center I have this supporting post, but around it I have the image. This is called a 360-degree field of view camera, and there are many versions of it. You see all around yourself, and autonomous vehicles usually use these kinds of images.

Another one is the tilt-shift camera. With the tilt-shift camera this is what you do. Do you remember the image that I showed you to give you the idea about the depth of focus? Let me go to that picture and show you the depth-of-focus thing. Remember those flowers? Where did the flowers go? Yeah, here it is.

Suppose this image plane is tilted a little bit, like this. If the image plane is tilted like that, this blue point will be in focus. But this green point will not be in focus, because it will appear here, and that will be blurry. This red point, though, will be in focus. So the focus area will be something like this.

This is what tilt-shift lenses do. By tilting the lens or shifting the lens you get different kinds of effects, and this is one of them. This is a real image. It is not an image of a toy city; it is a real city, but if you use the tilt-shift effect it looks like a toy image. Here is another one. Or I think this one is a toy? I don't know, I am doubting myself now. But these are definitely tilt-shift effects.

Another sensor is this. Can you tell me how they took this picture of this cup, this cylindrical structure? Well, kind of, but it is not regular classical stitching. It is something else, though the idea is the same. The cameras that we have used so far are three-dimensional cameras, right? They look at the three-dimensional world and produce two-dimensional images.

How about this: I have a camera. How should I draw it? This is my image plane, and in the real world let's say I have a circle like that. With the central projection the circle will appear like that on the image plane. But let's say I am using only the center column of this image plane, just a single column, and I am ignoring the rest.

So my images are just one-dimensional: I am looking at a two-dimensional world and producing one-dimensional images. My lens is not like a spherical lens; it is like a planar lens, you may take it that way. So what I am doing here is taking such a camera.

By the way, scanners work like this. On a scanner you have a sensor that has only one line of light-sensitive elements. You don't have a two-dimensional sensor; you have a one-dimensional sensor. You move it over your scanned document and you stitch the lines together, as was said. Whatever you get, you put together, and it becomes a two-dimensional image. That is how scanners work.

That is why scanners have very high resolution. They don't have something like a 1,000 by 1,000 sensor; they have a 4,000 by 1 sensor. If you take the image of the document by moving the sensor, maybe 5,000 times, it produces a 4,000 by 5,000 image at the end.

This one is doing the same. You rotate the object in front of a one-dimensional camera so that it is imaged one line at a time, and if you stitch the lines together, this is what you get. So it is a specialized scanner, you might say.

Good. One final and very interesting camera is the photo finish camera. Photo finish cameras are very old. In fact they were invented at the beginning of the 1900s for horse races, and we are still using them to find who has finished the race first and what their times are. This is an example. It is very similar to the previous one, but of course the application is different. Again this is like a scanner camera, taking one-dimensional images. Let's say this is your running track.

People are running in these lanes, and I have a camera here looking at the finish line, but this camera takes only one-dimensional images. If this place is empty and nobody is running, what kind of image would you get? This is the image that I would get.

It would be all lines, right? Because it is taking the same image over and over again, and if I put them together I get lines. But suppose people are running. At the time when I took the image, if I see a runner, I record that image.

That runner appears here, the second person appears here, and the third person appears here. Time is going that way. That is why this looks funny.
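A toy simulation can make the photo finish idea easier to see. The sketch below is only an illustration with made-up numbers: the function names, the lane brightness values, and the crossing times are all assumptions. Each exposure records one column at the finish line, and the columns are placed side by side so that the horizontal axis of the final image is time.

```python
def finish_line_column(n_lanes, crossing_times, t):
    """One exposure of the one-dimensional sensor at time step t."""
    column = []
    for lane in range(n_lanes):
        background = 10 * (lane + 1)   # static appearance of this lane
        # A runner shows up (value 255) only at the moment of crossing.
        column.append(255 if crossing_times.get(lane) == t else background)
    return column

def photo_finish(n_lanes, n_steps, crossing_times):
    """Stack successive exposures: rows are lanes, columns are time steps."""
    columns = [finish_line_column(n_lanes, crossing_times, t) for t in range(n_steps)]
    return [[columns[t][lane] for t in range(n_steps)] for lane in range(n_lanes)]

# Empty track: every exposure is identical, so each row is one constant line.
empty = photo_finish(3, 6, {})

# Hypothetical race: lane 0 crosses at step 2, lane 2 at step 3, lane 1 at step 5.
race = photo_finish(3, 6, {0: 2, 1: 5, 2: 3})
finish_order = sorted(range(3), key=lambda lane: race[lane].index(255))
print(empty)
print(race)
print(finish_order)
```

Reading the finishing order off the image amounts to reading positions along the time axis, which is why each column of a real photo finish picture belongs to a different instant.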

Nobody has hands like that. Nobody has a left leg this thick and a right leg this thin, right? Look at these ankles. That is because each column of this image is taken at a different time. This column here is taken at 21.8 seconds and this column is taken at 22.8.33, so there is more than one and a half seconds of difference between the two. So this is photo finishing.

I said this was invented at the beginning of the 1900s, right? There were no digital cameras then, so they used to do this with film cameras, as one-dimensional cameras: they moved the film all the time and got these kinds of images. By the way, nowadays one-dimensional cameras are sold a lot. They are called scan cameras.

Not area scan but line scan; let me type line scan cameras. Line scan cameras are industrial cameras used for inspection. Here it is, a very nice image. This is the sensor of the line scan camera. See, at the center there is only a single line; I don't have a two-dimensional sensor. And this one shows it very nicely, so let me try to go to the source page and show you the difference between line scan and area scan cameras.

What happened? Did I get it? No. Maybe I will choose some other one. Well, this one was good because it was comparing line scan and area scan. Soda Vision, whatever it is.

The observation satellites that orbit the Earth and take images of it all the time are line scan cameras. The line scan camera moves over its target and gets a single line every time. When it moves from one position to another, you get another line, and another line. Each line becomes one of your columns, and if you put them together you get a two-dimensional image. An area scan camera just takes two-dimensional images all the time. That is the difference. With the satellites I have the same effect.

Line scan satellites don't take two-dimensional images because the satellite is moving all the time. Instead they have a very nice one-dimensional camera like that one, which takes the images line by line. Then you put them together.
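The same stacking idea applies to a scanner or a line scan satellite, where the sensor moves instead of the subject. Here is a small sketch with an invented 4 by 5 scene; `read_line` and `line_scan` are hypothetical names, and the sizes are scaled down from the 4,000 by 5,000 scanner example mentioned in the lecture.

```python
def read_line(scene, position):
    """What a one-dimensional sensor sees at one position: a single column."""
    return [row[position] for row in scene]

def line_scan(scene):
    """Move the sensor across the scene and assemble the captured lines."""
    n_rows, n_cols = len(scene), len(scene[0])
    lines = [read_line(scene, pos) for pos in range(n_cols)]
    # Each captured line becomes one column of the final two-dimensional image.
    return [[lines[c][r] for c in range(n_cols)] for r in range(n_rows)]

# A sensor with 4 elements moved through 5 positions gives a 4 by 5 image.
scene = [[r * 10 + c for c in range(5)] for r in range(4)]
image = line_scan(scene)
print(len(image), len(image[0]))
```

The resolution along the direction of motion depends only on how many lines you capture, which is one way to read the remark that such sensors can produce very large images.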

Sometimes we have an image of 20,000 by 4,000, right? It is very large. Good. I wanted to show you the camera of the satellite itself, but I couldn't find the exact image. Sorry. That's it for this chapter. Do you have any questions about camera geometry, that kind of stuff?

I have a question. Sure. Think about the flower that we are photographing and at the same time displaying on a monitor. Before today I thought that the power of the light that comes to the sensor is put into one matrix, and then we manipulate it to enhance it or something else. But now I think we are not talking about a matrix of points. We are talking about a matrix, but we have lines, vectors. We are talking about lines and not dots. Did I understand right? Because it is very hard to understand.

Well, actually it depends on the problem at hand. If you are going to talk about the three-dimensional world, then it is much easier to express everything in homogeneous coordinates, and it is easy to talk about vectors. But if you are going to do image processing, image in and image out, then you don't have to deal with the homogeneous coordinate system, the three-dimensional world, or the vector representations. All you care about is two-dimensional (x, y) integer pairs: second column, third row, that's it. It depends on what you are doing.

If you would like to talk about depth, then you have to talk about the homogeneous coordinate system and about inferring the depth from these two-dimensional points. In that case we are doing three-dimensional computer vision, which is the main topic of this course. This is not an image processing course; this is a three-dimensional computer vision course.

It is a new point of view for me, and it is very interesting. Sure, it is interesting, yes. But using a single image it is really difficult to get the depth. You need more than one image, which means that you need another camera, or you need to take a video of the same scene while the camera is in motion. That is why we are going to look at them separately: more than one camera, and a camera in motion. Good. Any other questions?

Unfortunately, to be able to do something in the three-dimensional world we need to look at the images more carefully, so I need to introduce some commonly used image processing techniques. Let's look at some image processing background. Let me try to find my image processing slides. Okay, here they are.

We are going to make an introduction. These slides are mostly from Dr. Shah at the University of Central Florida. I think I took most of them from him, and he took most of them from somebody else, and I think I mixed them with some other sources, so I don't know exactly where I got them. But these slides are not mine, sorry. I don't take credit for these.

With this topic I am going to talk about filtering, smoothing, removing noise, the convolution operation, image derivatives, finding corners, and so on. First, image types; we have already seen them. Binary images contain only ones and zeros. A binary image is a two-dimensional matrix again, but of ones and zeros.

Then we have grayscale images, where each pixel is a single number, an integer between 0 and 255, or maybe a larger range. With color images, for each pixel we have the red, green, and blue components. Remember RGB. RGB comes from the human eye: in our eyes there are three types of color sensors, cells that are sensitive to red, green, and blue. So these are the color images. And this is how you represent binary images: columns and rows, with zero as black.
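As a small illustration of these three image types, the sketch below stores tiny made-up images as nested Python lists. The pixel values and the helper `split_channels` are assumptions for the example; real code would normally use an array library, but the layout is the same: rows of pixels, with one number per pixel for binary and grayscale images and three numbers per pixel for color.

```python
# Binary image: every pixel is 0 or 1.
binary_image = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]

# Grayscale image: every pixel is one integer, here in the range 0 to 255.
gray_image = [
    [0, 128, 255],
    [64, 192, 32],
]

# Color image: every pixel is a (red, green, blue) triple.
color_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]

def split_channels(image):
    """Separate a color image into three single-channel images."""
    red = [[pixel[0] for pixel in row] for row in image]
    green = [[pixel[1] for pixel in row] for row in image]
    blue = [[pixel[2] for pixel in row] for row in image]
    return red, green, blue

red, green, blue = split_channels(color_image)
print(red, green, blue)
```

Each of the three results has the same shape as a grayscale image, so anything written for a single-channel image can be applied to one channel at a time.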

These are the gray-level images, and then we have the color images, which have blue, red, and green channels. You may think of each channel as a grayscale image.

People talk about histograms a lot. What is the meaning of a histogram? I take this image; let's say it is a 100 by 100 image, so there are a total of 10,000 pixels. I take statistics of how many times I have seen gray level 50, for instance. This one says that I have seen gray level 50 more than a thousand times, but gray level 25, let's say, only 500 times, and so on. This is called the image histogram. The image histogram tells me how frequently each gray level occurs in the image.
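Counting gray levels is simple enough to sketch directly. The example below builds a synthetic 100 by 100 image whose values are invented so that no pixel is brighter than 140, then counts how often each level occurs; `gray_histogram` is a hypothetical helper, not a library function.

```python
def gray_histogram(image, levels=256):
    """Count how many pixels take each gray level."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts

# Synthetic 100 x 100 image (10,000 pixels) that only uses levels 0 to 140.
image = [[(r + c) % 141 for c in range(100)] for r in range(100)]
hist = gray_histogram(image)

# The brightest level that actually occurs shows how much of the range is unused.
brightest_used = max(level for level, count in enumerate(hist) if count > 0)
print(sum(hist), brightest_used)
```

The counts always add up to the number of pixels, and a long run of zeros at the bright end of the histogram is the sign of a dark image that does not use the whole range.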

How about this part? I don't see any gray levels above 140, right? That means this image is on the dark side; I don't see any bright parts in it. That is a bad thing. Why? Because if I had used all the gray levels, the whole range of light levels, maybe I could have obtained a more contrasty image. So the image histogram is something like that. Next: there is always going to be noise in your images.

What is image noise? If you are using an electronic camera, the sensors are very good at measuring the amount of light, but they are not perfect. Let's say light is coming into my sensor, and for this light it produces 1.27 volts. A second later I measure again and now it produces 1.28 volts. That can happen. The light did not change, but my electronic device produces a slightly different measurement. This is called noise, and it results in slightly different pictures of the same scene from time to time.

So one of the main sources of image noise is the camera electronics. Sometimes it comes from light variations or surface reflectance, and sometimes the lenses are not perfect. We assume that noise is random and occurs with some probability, and we usually use the Gaussian distribution to model it.
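The sensor example can be simulated with the standard library. In this sketch the true reading of 1.27 volts comes from the lecture, while the noise level of 0.01 volts, the fixed random seed, and the helper name `noisy_reading` are assumptions made only for the illustration.

```python
import random

def noisy_reading(true_value, sigma, rng):
    """One sensor measurement: the true value plus a zero-mean Gaussian sample."""
    return true_value + rng.gauss(0.0, sigma)

rng = random.Random(0)      # fixed seed so the run is repeatable
TRUE_VOLTS = 1.27           # the light itself does not change
SIGMA = 0.01                # assumed noise standard deviation, in volts

readings = [noisy_reading(TRUE_VOLTS, SIGMA, rng) for _ in range(1000)]
mean_reading = sum(readings) / len(readings)
print(readings[:3], mean_reading)
```

Individual readings differ from one another even though the light is constant, but their average sits very close to the true value. That is one simple way a known noise model can be exploited.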

Take additive noise. This is 0.0: usually I don't have much noise, but sometimes I get plus 1.0, additive, or minus 1.0, subtractive. The amount of noise that you add to your image follows this distribution: you take a sample from the distribution and add it to your image. Since the Gaussian distribution is very well understood, convenient to use, and realistic too, we usually use it to model our noise.

Why is modeling the noise important? Because once we know the model of the noise, we can develop techniques to get rid of the noise, or at least lessen its effects.

Well, it looks like I am out of time for this week. Are there any questions about this week's lectures or the topic we just started? Next week I am going to continue with this one, with the noise and everything, and then we will talk about derivatives.

The derivative is going to be important because we would like to take the derivative of the images in the x and y directions. While taking the derivative we are going to see how the image changes from one pixel to another. That gives lots of ideas about what interesting stuff happens in our images, such as edges, corners, constant regions, and so on. Good.

I joined this course last week. Will you share the slides? Yes, I uploaded the slides to my Teams page. The syllabus is also available on the Teams page, in the files area. They are all there. The videos of last week and the previous week are available on the Teams page and also on my YouTube channel. Watch the videos of the first and second weeks and you will know all the details about how we are going to go through the course. Any other questions? Good. I will see you next week, then.

Thank you. Bye.