Videos on how to use vRealize Orchestrator

A few weeks ago I presented on automation at the Louisville VMUG. During the session I mentioned that vRealize Orchestrator is a really good skill to learn. It allows you to become the orchestrator of external services, and it’s a critical skill going forward. I didn’t want to bore the group by making them watch how-to videos, but I promised to post them. Here they are:

Part 1

Part 2

Enjoy

Remove a vCenter from a PSC Domain

With the new architecture for the Platform Services Controller (PSC) I expect we will see a lot more vCenters joined to a central PSC. At one point I had two vCenters connected to one PSC. I deleted one vCenter but continued to get the error:

Could not connect to one or more vCenter Server systems.

[Screenshot: error shown after the vCenter was removed]

My PSC log was riddled with connection failures and it seemed to run really slowly. Removal is easy and, thanks to the VMware team, it’s well documented. In my case the PSC was a Linux appliance, so I ran the following command:

cmsso-util unregister --hostID 1234 --node-pnid vCenter2.griffiths.local --username administrator@vsphere.local --passwd password

[Screenshot: output of the removal command]

This is all documented in the following KB: 2106736.

I could not locate an explanation of what hostID to use, so I duplicated the one from the KB article and it worked. My errors are gone.

vRO Scriptable task to kill idle vCenter sessions

Killing vCenter sessions that are older than 24 hours is a recommended best practice. This scriptable task will kill all idle sessions that are older than the variable maxidletime. Cut and paste and go.
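A minimal sketch of the selection logic such a task needs, not a full workflow. It assumes maxidletime is expressed in minutes and that each session object carries a key and a lastActiveTime date, as vCenter user sessions do. The function name findIdleSessionKeys is hypothetical, and the vRO wiring in the trailing comment is an untested assumption.

```javascript
// Hypothetical helper: returns the keys of sessions idle longer than maxidletime.
// Assumptions: maxidletime is in minutes; every session has `key` (string)
// and `lastActiveTime` (Date).
function findIdleSessionKeys(sessions, now, maxidletime, currentSessionKey) {
    var idleKeys = [];
    var limitMs = maxidletime * 60 * 1000;
    for (var i = 0; i < sessions.length; i++) {
        var session = sessions[i];
        // Never kill the session the workflow itself is running in.
        if (session.key === currentSessionKey) {
            continue;
        }
        var idleMs = now.getTime() - session.lastActiveTime.getTime();
        if (idleMs > limitMs) {
            idleKeys.push(session.key);
        }
    }
    return idleKeys;
}

// Inside a vRO scriptable task the call might look like this
// (assumed wiring; `vcenter` would be a VC:SdkConnection input):
//   var sm = vcenter.sessionManager;
//   var keys = findIdleSessionKeys(sm.sessionList, new Date(), maxidletime, sm.currentSession.key);
//   if (keys.length > 0) { sm.terminateSession(keys); }
```

Keeping the selection in a plain function makes it easy to log the keys first and only terminate once the list looks right.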


vRO add all virtual machines to NSX exception list

Almost everyone is implementing NSX in a brownfield environment. Switching the DFW default rule to deny all is the safest end state, but it is hard to reach in a brownfield environment: denying all traffic on day one is a bad idea, and converting every application into DFW rules at once is not practical. One way to solve this is to put all virtual machines on the exception list, then move each one off the list once you have created the correct allow rules for it. I didn’t want to add all my machines manually through the GUI, so I explored the API.

How to explore the API for NSX

VMware’s beta developer center provides the easiest way to explore the NSX API. You can find the NSX section here. Searching the API for “excep” quickly turned up the following answer:

[Screenshot: exclude list entry in the API explorer]

As you can see there are three methods (GET, PUT, DELETE). It’s always safe to start with a GET as it does not produce changes. Using Postman for Chrome I was quickly connected to NSX; see my settings below:

[Screenshot: Postman settings]

The GET returned many lines describing the machines that I had manually added to the exception list. For example:

[Screenshot: one virtual machine entry from the response]

Looking at this virtual machine you can see it’s identified by its objectID, which is what the PUT and DELETE functions expect. The following worked perfectly:

Delete
https://192.168.10.28/api/2.1/app/excludelist/vm-47

Put
https://192.168.10.28/api/2.1/app/excludelist/vm-47

A quick GET showed that vm-47 was back on the list. Now we had one issue: the designation and inventory of objectIDs is not a construct of NSX but of vCenter.
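The two calls above differ only in the HTTP method, so the request can be described with a small helper. This is a sketch under assumptions: excludeListUrl and excludeListRequest are hypothetical names, the host and path are the ones used in this post, and authentication is left out.

```javascript
// Builds the exclude-list URL for one VM objectID, for example vm-47.
function excludeListUrl(nsxManager, vmId) {
    return "https://" + nsxManager + "/api/2.1/app/excludelist/" + encodeURIComponent(vmId);
}

// Describes the request without sending it.
// method is "PUT" (add to the list) or "DELETE" (remove from the list).
function excludeListRequest(nsxManager, vmId, method) {
    if (method !== "PUT" && method !== "DELETE") {
        throw new Error("Unsupported method: " + method);
    }
    return { method: method, url: excludeListUrl(nsxManager, vmId) };
}

// Example: the same request that was sent from Postman.
var request = excludeListRequest("192.168.10.28", "vm-47", "PUT");
```

The returned object can be handed to whatever HTTP client is available; only the VM objectID changes from call to call.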

The Plan

To make this work I needed to do the following:

  • Gather list of all objectID’s from vCenter
  • Put the list one at a time into NSX’s exclude list
  • Have some way to orchestrate it all together

No surprise, I turned to vRealize Orchestrator. I wanted to keep it to generic REST connections and not use the NSX plugin. So my journey began.

Orchestrator REST for NSX

  • Log in to Orchestrator
  • Switch to workflow view
  • Expand the library and locate the Add a REST host workflow
  • Run the workflow

[Screenshots 1 to 4: input screens of the Add a REST host workflow]

  • Hit submit and wait for it to complete
  • You can verify the connection by visiting the administration section and expanding REST connections

Now we need to add a REST operation for addition to the exception list.

  • Locate the Add REST operation workflow
  • Run it
  • Fill out as shown

[Screenshot: Add a REST operation inputs]

You now have a PUT method that takes {vm-id} as input before it can run. In order to test, we go back to Postman, delete vm-47 and do a GET to verify it’s gone:

Delete:

https://192.168.10.28/api/2.1/app/excludelist/vm-47

Get:

https://192.168.10.28/api/2.1/app/excludelist

 

It is missing from the GET response. Now we need to run our REST operation:

  • Locate the workflow called: Invoke a REST operation
  • Run it as shown below

[Screenshots 1 to 3: input screens of the Invoke a REST operation workflow]

Once it completed, a quick Postman GET showed me vm-47 was back on the exclude list. Now I am ready for prime time.

Creation of an Add to Exclude List workflow

I need to create a workflow that just runs the REST operation to add a machine to the exclude list.

  • Copy the Invoke a REST operation workflow
  • The new workflow should be called AddNSXExclude
  • Edit new workflow
  • Go to inputs and remove all param_xxx except param_0
  • Move everything else but param_0 to Attributes

[Screenshot: inputs and attributes of AddNSXExclude]

  • Let’s edit the attributes next
  • Click on the value for the restOperation attribute and set it to the “Put on Exclude List ..” operation you created earlier

[Screenshot: restOperation attribute]

  • Go to the Schema and edit the REST call scriptable task
  • Remove all param_xxx except param_0 from the IN on the scriptable task

[Screenshot: IN parameters of the scriptable task]

  • Edit the top line of the scripting to read like this:

var inParamtersValues = [param_0];

  • Close the scriptable task
  • Click on presentation and remove everything but the content question

[Screenshot: presentation tab]

Now we have a new issue: the workflow must not fail when the return code is not 200, for example if the object is already on the exception list. We just want everything on the list right away. So edit your schema to remove everything but the REST call:

[Screenshot: schema with only the REST call]
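The same "keep going when one call fails" idea can be sketched in plain JavaScript. This is an illustration, not the workflow itself: addAllToExcludeList is a hypothetical name, and putToExcludeList stands for whatever performs the PUT for one VM ID and returns the HTTP status code. No claim is made here about which status NSX returns for a machine that is already on the list.

```javascript
// Adds every VM ID to the exclude list and records failures instead of stopping.
// putToExcludeList is supplied by the caller (hypothetical): it takes one
// VM ID and returns the HTTP status code of the PUT.
function addAllToExcludeList(vmIds, putToExcludeList) {
    var result = { added: [], skipped: [] };
    for (var i = 0; i < vmIds.length; i++) {
        var status;
        try {
            status = putToExcludeList(vmIds[i]);
        } catch (e) {
            // A thrown error is recorded with status -1 and the loop continues.
            status = -1;
        }
        if (status >= 200 && status < 300) {
            result.added.push(vmIds[i]);
        } else {
            result.skipped.push({ id: vmIds[i], status: status });
        }
    }
    return result;
}
```

Returning the skipped entries keeps the run from failing while still leaving a record of which machines need a second look.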

 

Put it all together with a list of virtual machines

Time for a new workflow with a scriptable task.

  • In the general tab add a single attribute named vmid that is an array of strings

[Screenshot: vmid attribute]

  • Add a scriptable task to the schema
  • Add a foreach element to the schema after the scriptable task
    • Link the foreach loop to the AddNSXExclude workflow you made in the last step
    • Link vmid to param_0

[Screenshot: foreach element bindings]

  • Edit the scriptable task and add the following code:

// Get the list of all VMs known to the vCenter plugin
var vms = System.getModule("com.vmware.library.vc.vm").getAllVMs();

// Collect each VM's objectID (for example vm-47)
var vmid = new Array();

for each (var vm in vms)
{
    vmid.push(vm.id);
}

 

  • Add an IN for vmid and an OUT for vmid
  • Run it and you are done; you can see the response headers in the logs section

Hope it helps you automate some NSX.

What does apply to mean in NSX Firewall?

When I first started using NSX I ran into this little problem.   What does apply to mean and how should I use it?

Background

I believe the background for apply to comes from physical firewalls. They allowed you to apply rules to a specific interface. Applying to an interface had the following effects:

  • Limit the number of rules that have to be processed
  • Allow specific fine-grained controls

Applying rules to specific interfaces had a few issues:

  • You had to have a good understanding of the network topology in order to apply rules correctly
  • New interfaces may be missed by rules

You also had the ability to apply the rule to all interfaces that existed. On the surface, if you had enough hardware to apply the rules everywhere, it worked great. In practice, tons of interfaces that didn’t need the rules now had them. There are a few more problems:

  • New interfaces would have no rules, and all rules would have to be applied to them
  • These rules exist only on a single firewall; rule creation is specific to that firewall

NSX Firewall

The NSX firewall takes a similar approach to firewall application. All firewall rules are created in NSX Manager and stored inside the NSX Manager database. By default rules are applied to the “distributed firewall”. This applies the rules to every virtual machine’s vNIC, regardless of the virtual machine’s location. This creates the same problem as applying on every interface: each vNIC will have a long list of rules to attempt to match.

This is where the apply to tag becomes interesting.   In order to explain I’ll use a simple example:

Two virtual machines: 172.16.0.2 on VNI 5000 and 172.16.20.2 on VNI 5002.

My default firewall rule set allows them to communicate without any issues. Let’s assume I want to block all traffic between these machines, so I create the following rule:

[Screenshot: the block rule]

Source:  172.16.0.2 virtual machine

Destination: 172.16.20.2 virtual machine

Service: Any

Action: Block

Apply to: Distributed firewall (default)

 

Using Traceflow we can identify where it was blocked:

[Screenshot: traceflow result with the rule applied to the distributed firewall]

You can see clearly that the default of distributed firewall applied the drop action at the source. This is really great because it limits the traffic on the physical wire. Since the object is known as a managed object in NSX, the rule is enforced as early as possible. If you have a physical entity that is not managed by NSX, the rule will be applied at the destination. This is hard to prove because traceflow cannot provide visibility into physical entities.

What does apply to do?

Simply put, it tells NSX where to apply the firewall rule. Let’s examine some of the options for my rule above:

  • Host
  • Cluster
  • Virtual machine
  • IP or MAC set
  • etc..

It provides the full list of objects that DFW rules can be made with, including dynamic sets and tags. This is really powerful. For the sake of this example let’s apply the rule to the destination virtual machine instead of the DFW.

[Screenshot: the rule applied to the destination virtual machine]

Using traceflow we can see the results:

[Screenshot: traceflow result with the rule applied to the destination]

My attempted connection was dropped at the destination, where I applied the firewall rule. You can also see how between steps 7 and 8 the message left host 3 and went across my physical network to host 1 (the black hole of visibility).

Why use the apply to feature?

  • Reduce the number of rules applied to each vNIC
  • Enforce the rule at a specific location (think situations with VM overlap or rule overlap)
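The first point can be illustrated with a toy model. This is not NSX code: the rule objects, the appliedTo field and the function rulesForVm are all hypothetical, and the string "DFW" simply stands for the default of applying a rule everywhere.

```javascript
// Toy model: each rule names the VMs it is applied to,
// or uses the string "DFW" to mean every vNIC.
function rulesForVm(rules, vmName) {
    var matched = [];
    for (var i = 0; i < rules.length; i++) {
        var scope = rules[i].appliedTo;
        if (scope === "DFW" || scope.indexOf(vmName) !== -1) {
            matched.push(rules[i].name);
        }
    }
    return matched;
}

var rules = [
    { name: "block-a-to-b", appliedTo: ["vm-b"] }, // scoped to the destination VM
    { name: "default-allow", appliedTo: "DFW" }    // lands on every vNIC
];

var onVmA = rulesForVm(rules, "vm-a"); // only the rule applied everywhere
var onVmB = rulesForVm(rules, "vm-b"); // both rules
```

In this model the scoped rule is evaluated only where it was applied, which is the reduction in per-vNIC rule count described above.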

Apply to does add to the complexity of the environment and of troubleshooting, but it can limit scope. This is where careful planning and understanding of the environment can really help. Arkin can help as well, but that’s another day’s post.

Greatest tool for NSX!

I want to let you in on a little secret of NSX called Traceflow.   It was made available in the 6.2 release and I am in love with it.   In order to explain my love let’s do a history lesson:

History Lesson (Get off my lawn kids time)

Back in the old days (pretty much right now in every enterprise) you had a bunch of switches, routers and firewalls.   When a server was having a problem communicating with another server you had to trace its MAC address through every hop manually.   You might be lucky and use a SIEM to identify if a firewall was dropping the traffic.   Understanding each hop of the traffic is a pain.    It takes time and can be very complex in enterprise implementations.

Enter NSX

NSX does some complex routing, switching and firewalling.   Your visibility into the process in the past was articles like mine.   With traceflow you can prove your theory and identify data paths.    It still does not have visibility beyond the NSX world and into the physical.   Hopefully some day we will have that too.   Traceflow can get you pretty close.

Where is this traceflow of which you speak?

Log in to vCenter, select Networking and Security, and it’s on the right side most of the way down. It allows you to select a source and a destination and then inject packets. The NSX components report back as the injected packet passes by, allowing you to trace the flow of communication.

Show me some meat

Sounds good. Let’s assume we have two virtual machines, 172.16.0.2 and 172.16.0.3, both on VNI (think VLAN) 5000. They are on the same ESXi host. There are no firewall rules blocking traffic. Here is the output from traceflow:

[Screenshot: traceflow with both machines on the same host]

Look at that. The injected packet came from 172.16.0.2 and hit the vNIC firewall, then was forwarded directly to 172.16.0.3’s vNIC firewall and into the machine. This is simple and exactly what we expect. Let’s do the exact same thing, except move the second machine to another ESXi host:

[Screenshot: traceflow with the machines on different hosts]

Now we have added the VTEP (VXLAN tunnel endpoint) connection between ESXi hosts. VTEP communication is layer 3 between ESXi hosts, which stretches VNI 5000 between them whether they are far apart or right next to each other.

Neat meat, but it really only shows layer 2 communication, and that’s easy

How about some routing then? Two virtual machines, 172.16.0.2 on VNI 5000 and 172.16.10.2 on VNI 5001, both on the same ESXi host:

[Screenshot: traceflow across two networks]

Look at that: now we see the logical router in the mix, taking the traffic from logical switch LS-172.16.0 and routing it to logical switch LS-172.16.10. Suddenly the flow of traffic is not a mystery.

What about if the firewall is blocking the traffic?

I assumed you would ask, so here is a new firewall rule I added:

[Screenshot: the new firewall rule]

And the traceflow:

[Screenshot: traceflow after the firewall rule was added]

Yep, my packet was dropped, and it tells me where and which rule number blocked it.

What is the only problem with traceflow?

That it does not show the traffic flow on my physical network. Tracing that part should be very simple: given that all my NSX traffic is routed, we should not have complex layer 2 stretches or lots of VLANs to keep in place. It’s just routed communication that can start at top of rack with the correct design.

Why I took a pay cut to work at VMware

Warning:  This is a love rant for VMware.  You might want to skip if you are looking for the normal technical details of my blog.

Great title, eh? Really catchy and intended to get you to read, and it’s true. Two months ago I left a great job with IBM to work with VMware and took a pay cut to do it. When you switch jobs there are lots of reasons, and money is just part of the deal. You might leave a job for the following reasons:

  • Too much travel
  • Bad situation with management
  • No career growth potential
  • Money
  • A new challenge
  • etc…

The reasons are often a combination of these and other factors. I left my job as a Senior VMware Architect for IBM to work as a Solutions Architect for VMware. I wanted to take this blog post to explain why I took the job. Some years ago I was a happy yet bored systems administrator. My boss suggested that I attend an industry conference as a perk. I went to VMworld. I came back from the conference really excited about the future of VMware and the cloud. I was invigorated by the energy and the vision of VMware’s executives. I continued to learn about their technology and found it very refreshing in the market. This led me to focus my career away from Linux and into VMware technology.

Culture

I have taken two runs at VMware jobs in the past and did not make it. Each time I was interviewed by multiple people. Each of those people took the interview time to teach me new skills and help shape my thinking for success. I love that attitude. It’s a simple attitude that we are stronger as a team than as individuals. Most of the company has this attitude and I love it.

Innovation

VMware’s technology continues to push the limits of the traditional datacenter while proving real business value. They have proven to be innovative through acquisitions (NSX) and internal research and development (vSAN). I love this approach. Too many companies stop internal research and lose the innovative spirit; this is simply not true at VMware.

Career Growth

VMware takes career growth seriously. Two weeks after I started with the company my boss asked me for specific goals that can affect my bonus structure. These goals are recorded and tracked, and VMware is serious about enabling me to meet them. Managers seem to be interested in retaining talent by supporting growth and interests.

New Challenges

Yes, it’s true: I am a challenge junkie. It’s what has caused me to get two VCDXs in two years. Once a goal is on paper I am a nut case about achieving it. Working for VMware as a Solutions Architect represents some new challenges and lots of learning, which should keep me going for a little while.

It’s all great when you are new

I completely agree: I am very young in the company, so my view is narrow. I believe the future to be very good for VMware and I am excited to join them on the journey.