27 December 2013

Understanding networks in vCloud Director - Part 2/2

Read first "Understanding networks in vCloud Director - Part 1/2"
http://virtualshocks.blogspot.com.es/2013/09/understanding-networks-in-vcloud.html

This part 2/2 is about "vApp networks" in VMware vCloud Director. vApp networks connect the virtual machines inside a vApp; it is like configuring a router in front of a vApp to separate its VMs from the rest of the vApps or the cloud environment.

What VMware says: "vCloud Director coordinates with vCloud Networking and Security Manager to provide automated network security for a vCloud environment. vCloud Networking and Security Edge gateway devices are deployed during the provisioning of routed or private networks. Each vCloud Networking and Security Edge gateway runs a firewall service that allows or blocks inbound traffic to virtual machines that are connected to a public access organization virtual datacenter network. The vCloud Director web console exposes the ability to create five-tuple firewall rules that are comprised of source address, destination address, source port, destination port, and protocol."
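As a purely illustrative example of such a five-tuple rule (every value here is invented): source 10.10.10.0/24, destination Any, source port Any, destination port 443, protocol TCP > Allow. Inbound traffic to the connected virtual machines is matched against rules of exactly this shape.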


When creating a vApp network, the options are:

-Direct: vApps connect directly to the organization virtual datacenter network.
-Routed: a new network where the router provides NAT and firewall functions.
-Isolated: no connections outside the vApp; only the VMs inside the vApp can communicate with each other.
-Fenced: identical virtual machines can exist in different vApps; the virtual router provides isolation and proxy ARP.

Before looking at some examples, pay attention to the Network Pool options shown below:


...Through the wizard:


...Through the vApp diagram tab in the vCloud Director GUI. This view is one of the best ways to review networking configuration issues: clicking on a VM highlights its paths:



...Through the Networks tab you can select the network type and the NAT or FW options:



Let's check some examples:

-CASE 1: where 2 organizations communicate through an External Network: vShield Edge routing and static routes are necessary.

-CASE 2: where 2 organizations communicate without NAT, but where a vShield Edge is still necessary.



LINK:  vCloud Networking

22 October 2013

VCP5 + VCP-IaaS exam = VCP-Cloud ....exam passed!!!

After a few weeks AWAY FROM KEYBOARD because of a nasty whiplash injury... I am resuming activity on the blog with many ideas in my head and many posts to publish :-)

To start with, the post that was left pending last October: I passed the VCP-IaaS exam with 466 points out of a possible 500 (the pass mark is 300, with 85 questions in total :-) ), thereby earning the VCP-Cloud certification.





Remember that the VCP5 certification is now called VCP5-DCV, and there is a new VCP-Cloud certification
that you can earn either by taking the VCP-Cloud exam or by passing the VCP-IaaS exam
if you are already a VCP5.



The VCP-IaaS exam covers everything related to vCloud: vCloud Director, vShield, vCenter Chargeback, etc.

Recommendations for passing the exam successfully:

1- Download the Exam Blueprint to be clear about what is and is not covered in the exam:

2- Take the two recommended (though not mandatory) courses:
http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&a=one&id_subject=38932
http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www_edu&a=one&id_subject=48699

3- Build a lab with vCenter, vCloud Director and vCenter Chargeback.

4- Practice a LOT: everything related to networking and the different types of resources that can be created is the core of the exam! Concepts such as External-Routed networks, vShield, Organization VDCs and Provider VDCs must be crystal clear.


A reminder of the new certification roadmap:
More info: http://virtualshocks.blogspot.com.es/2013/09/new-vmware-certification-roadmap.html






11 October 2013

VMware & Oracle performance tips on NFS NetApp storage

Performance problems are tedious to solve, and their root cause is not always where the first hypotheses point when someone opens Pandora's box and complains about the poor performance of some virtualized server.

It is very common, out of ignorance of virtualization, to blame it first for every ill... knowing how to connect to a vCenter and create a virtual machine is "not" knowing virtualization.

All the areas involved must always work together. Let me explain: a performance problem in a virtual machine is not exclusively a problem of VMware or whichever hypervisor is in use; it is a problem involving many elements and departments, such as Storage, Networking, Databases, etc.
 
The sum of actions usually produces good results at the end of a performance analysis, and all the departments tend to learn from those experiences... accepting an improvement is not admitting a mistake but contributing value and knowledge. On that note, remember how important it is to document everything we can: "human beings are the only animal that trips twice over the same stone, because the first time they don't document it".



Let's get down to business with 3 basic points that, although focused on VMware, NetApp and Oracle, can be extended to any virtualization, storage and database environment:

1- Apply NetApp best practices (check with the vendor before applying them):

Focus on:
- Review the NFS networking configuration: use an isolated VLAN for NFS traffic, keep the NFS server and the NAS on the same subnet, reduce the physical network hops between the server and the NAS, enable jumbo frames for NFS, use 10GbE cards, set up teaming (vifs & etherchannel) and review the load-balancing policies.
- Review the alignment of the file systems and VMs.
- Apply the recommendations of the NetApp plugin integrated into vCenter (check with the vendor first).
- Review the IOPS capacity of the aggregates and controllers (monitor with OnCommand Core).
- Review the aggregate configuration: add physical disks and force a reallocate, verify the deduplication schedule, review the thin provisioning status and the free space on the NAS, review the snapshot schedule, and review the optimal RAID configuration for better performance of critical VMs.
- Consider creating Flash Pool aggregates with SSDs.
- Consider FC connectivity for critical VMs.
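As a hedged sketch, some of these checks translate into Data ONTAP 7-Mode CLI commands like the following (volume and interface names are hypothetical; check with NetApp before touching production):

# Watch controller load (CPU, IOPS, latency) in real time
sysstat -x 1
# Measure fragmentation on a volume and, if needed, force a reallocate
reallocate measure /vol/vol_nfs01
reallocate start -f /vol/vol_nfs01
# Enable jumbo frames on the NFS interface
ifconfig e0a mtusize 9000
# Review the deduplication status and schedule of a volume
sis status /vol/vol_nfs01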

Related documentation:
NetApp Technology - OnCommand Core Monitoring
NetApp Storage Best Practices for VMware vSphere
NetApp Technology - Virtual Machine Alignment
NetApp, Best Practices for File System Alignment in Virtual Environments
NetApp, location of the mbrscan and mbralign tools
Clustered Data ONTAP NFS Implementation Guide
NetApp Data Compression and Deduplication Deployment and Implementation Guide
VMware vSphere 5 on NetApp Clustered Data ONTAP 8.1
NetApp and VMware Virtual Infrastructure 3 Storage Best Practices (one of NetApp's most popular TRs)



2- Apply VMware best practices for NFS:

Focus on:
- Reorganize the datastores (volumes): sizes, and layout across the aggregates.
- Rebalance the number of VMs across different datastores and group them by OS type.
- Move critical VMs to dedicated volumes.
- Verify the VAAI status for NFS (limited options).
- Review and tune the disk, CPU and RAM shares of critical VMs.
- Create several VMkernel ports for NFS with different IPs and assign different outgoing NICs to spread the traffic (1 vSwitch for NFS with 2 VMkernel ports > vmkernel01 with nic1 active and nic2 standby, vmkernel02 with nic2 active and nic1 standby).
- Configure a vDS that allows aggregating several physical NICs in the same port group for NFS to gain load capacity and/or balancing (IP Hash, with limitations).
- Tune the advanced NFS parameters.
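For illustration, a minimal sketch of some of these NFS tweaks from the ESXi 5.x shell (the datastore name, IP and values are examples only; validate them against the VMware and NetApp guides first):

# Mount an NFS datastore (hypothetical IP, export and name)
esxcli storage nfs add -H 192.168.10.50 -s /vol/vol_nfs01 -v ds_nfs01
# Raise the maximum number of NFS mounts per host
esxcli system settings advanced set -o /NFS/MaxVolumes -i 64
# Grow the TCP/IP heap to match the higher mount count
esxcli system settings advanced set -o /Net/TcpipHeapSize -i 32
esxcli system settings advanced set -o /Net/TcpipHeapMax -i 128
# Check the current value of any of these parameters
esxcli system settings advanced list -o /NFS/MaxVolumes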

Related documentation:
Increasing the default value that defines the maximum number of NFS mounts on an ESXi/ESX host
NFS with IP Hash Load Balancing
Best Practices for Running VMware vSphere on Network-Attached Storage (NAS)
Verify Hardware Acceleration Status for NAS
Performance Implications of Storage I/O Control–Enabled NFS Datastores in VMware vSphere 5.0
Best Practices for Performance Tuning of Latency-Sensitive Workloads in vSphere Virtual Machines
Performance Implications of Storage I/O Control-Enabled NFS Datastores
Load Balancing with NFS and Round-Robin DNS

3- Apply VMware best practices for Oracle (databases):

Focus on:
- Do not overcommit CPU and RAM (RAM reservation: virtual memory size = Oracle System Global Area + operating system).
- Using fewer vCPUs is more efficient than using many vCPUs.
- Align the file systems with the NAS.
- Enable jumbo frames on VMware NFS and the NAS.
- Use the Paravirtual SCSI adapter (PVSCSI): kb.vmware.com/kb/1010398
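A minimal .vmx sketch of how these settings end up looking (the sizes are hypothetical; the reservation should equal SGA + OS as noted above):

# 16 GB of RAM, fully reserved (SGA + operating system)
memsize = "16384"
sched.mem.min = "16384"
# Paravirtual SCSI adapter and VMXNET3 NIC
scsi0.virtualDev = "pvscsi"
ethernet0.virtualDev = "vmxnet3"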

Related documentation:
Running Business-Critical Applications on Oracle RAC, VMware and NetApp
Oracle® Databases on VMware vSphere™ 4 ESSENTIAL DEPLOYMENT TIPS
VMware vCenter Server 5.1 Database Performance Improvements and Best Practices for Large-Scale Environments
Oracle Databases on VMware VMware vSphere 5 RAC Workload Characterization Study (VMware VMFS)
Oracle Databases on VMware High Availability
Oracle Databases on VMware Best Practices Guide
Oracle Databases on VMware RAC Deployment Guide
Oracle Dev/Test on VMware vSphere and NetApp Storage Solutions Overview


...A lot to read and a lot to learn :-)

09 October 2013

Do you know what NUMA is? really? - Part 2

Once the NUMA concept is clear, let's review how it is used on VMware.

Virtual NUMA (vNUMA) topology is available on VMware vSphere 5.0 and later with virtual hardware version 8 and later. It is enabled by default when a VM has more than 8 virtual CPUs, but it can be disabled or modified using advanced options (to enable vNUMA on 8-way or smaller VMs, modify the numa.vcpu.min setting):
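For example, a hedged .vmx sketch (the threshold value is illustrative):

# Present vNUMA to VMs with 4 or more vCPUs (the default threshold is 9)
numa.vcpu.min = "4"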


When CPU affinity is enabled on a virtual machine, it is treated as a non-NUMA client and is excluded from NUMA scheduling. This means the NUMA scheduler will not set memory affinity for the virtual machine to its current NUMA node, and the VMkernel can allocate memory from every available NUMA node in the system. This will increase memory latency, and the %RDY value will probably get higher.



An example of how NUMA (or vNUMA) works:


With NUMA enabled, the virtual machine will get its vCPUs from the same NUMA node (that is, from the same socket).

But... what about the best practices?

1- NUMA nodes: take your time and review the socket features carefully, because some physical CPUs are composed of 2 underlying nodes. For example, some 12-core AMD Opteron sockets are internally composed of 2 nodes with 6 cores each. That is, a server with 4 sockets of 12 cores each... has 8 NUMA nodes, not 4.

2- The "default" configuration on a new VM is "cores per socket" equal to 1, which means that vNUMA is enabled and lets the virtual topology present the best performance to the VM.

3- If the number of "cores per socket" needs to be changed (for licensing purposes, for example), vNUMA will not apply the best configuration to the VM and performance will be affected unless you choose the right combination of vCPUs and cores per socket, mirroring the physical NUMA topology of your server (review your NUMA node configuration).

4- Cluster and DRS or vMotion: "One suggestion is to carefully set the cores per virtual socket to determine the size of the virtual NUMA node, instead of relying on the size of the underlying NUMA node, in a way that the size of the virtual NUMA node does not exceed the size of the smallest physical NUMA node in the cluster. For example, if the DRS cluster consists of ESXi hosts with four cores per NUMA node and eight cores per NUMA node, a wide virtual machine created with four cores per virtual socket would not lose the benefit of vNUMA even after vMotion. This practice should always be tested before being applied." (VMware transcript)
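As an illustrative sketch (hypothetical sizing for a cluster whose smallest physical NUMA node has 4 cores), these .vmx settings build a wide 8-vCPU VM with 4 cores per virtual socket:

# 8 vCPUs split into 4 cores per virtual socket > two virtual NUMA nodes
numvcpus = "8"
cpuid.coresPerSocket = "4"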





07 October 2013

New certification: VMware Certified Associate - Cloud (VCA-Cloud) ...exam passed!!!


VMware: "Whether you simply want to become more conversant in virtualization technologies or ultimately want to be a recognized virtualization expert, VMware certification is now an essential step for your career"

This is one of the new VMware certifications, which I recently passed successfully :-)
The exam is taken online through PearsonVUE and does not require any mandatory course:


The first step is to request access to the exam from the VMware website:

Once we have the OK from VMware, we go to the PearsonVUE website, where we find the exam "VCAC510: VCA-Cloud" (120 mins & English); only until the end of October is it free with the code VCA13ICS.


...enjoy!



03 October 2013

VMware vCloud: reinstalling ESXi agents manually

Once we assign a vCenter (cluster) to a vCloud resource pool, the first thing that gets installed automatically is the vCloud agent on the ESXi hosts that make up the assigned cluster.
If for some reason these hosts are moved to another vCenter or another vCloud, or anything else requires reconnecting them to a vCloud, we may run into a Status error, "Cannot prepare the host", as we can see in the image:


If we click on the error message we can see more info about the error:


First we check whether this host has any agent installed with the command:

esxcli software vib list | grep vcloud

and then we uninstall it with the command:

esxcli software vib remove -n vcloud-agent

As we can see in the image:

We go back to the host from vCloud and launch the "Prepare Host" option again:


Once the process has finished, the agent should now be correctly installed (we can check it again by running the esxcli command). But in this case we run into another typical error, "java.net.UnknownHostException", as we can see in the image:


This error is due to an incorrect DNS configuration on the vCloud Director cell. We connect to the vCloud manager from the console, go to the Network > Address tab and configure the correct DNS servers:


We launch "Prepare Host" again ...et voilà!



01 October 2013

vSphere Web Client login error: Failed to connect to VMware Lookup Service. SSL certificate verification failed (vCenter Appliance)

If when logging in from the vSphere Web Client we run into the error "Failed to connect to VMware Lookup Service https://ip:7444/lookupservice/sdk - SSL certificate verification failed.":






Nothing simpler than connecting to the console of the VMware vCenter Server Appliance and setting the certificate regeneration option to yes from the "Admin" tab > "Toggle certificate setting" > "Certificate regeneration enabled" > "yes":
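Alternatively (an assumption on my part, not the method shown in these screenshots), the regeneration can reportedly also be requested from the appliance shell with the vpxd_servicecfg utility:

# Hypothetical shell equivalent on the vCenter Server Appliance 5.x
/usr/sbin/vpxd_servicecfg certificate regenerate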



We launch a reboot from the "System" tab:


and now we can log in without any problem:


...et voilà!



27 September 2013

VMware vCloud: deploying vShield App on a nested ESXi



Before getting into the subject, we should be clear about what "nested virtualization" is: a nested ESXi is an ESXi installed inside a virtual machine that is in turn created on a physical ESXi; that is, "a virtualized hypervisor":


We start from the point where vShield Manager is already deployed and connected to vCenter.

Let's get down to business. The easiest way to deploy vShield App, as part of the VMware vCloud Director installation, is to do it in graphical mode:

-From the vShield tab of the host, connected through the vSphere Client
-From the vShield Manager console over https (the default user is "admin" and the password "default")

Remember that one vShield App must be deployed per host in the environment, just like vShield Endpoint and vShield Data Security; but only one vShield Manager instance per vCenter.

From the vSphere Client: we select a host, and from the vShield tab we check the Apps to install and click Install:


We configure the IPs, the network and the datastore where we want the App to be deployed:
NOTE: be careful about deploying the App on the host where vCenter resides! You can vMotion it and then resume the installation, since the process could cause network outages.

  
During the installation process it may hang at this point; although no error message ever appears, we can look at the App's console:



Since vShield App is a virtual machine, from its console we can see the error message:


This is where we hit the gap that prevents virtual machines requiring an x86-64 CPU from running on a virtual (nested) ESXi: we must enable the "Expose hardware assisted virtualization to the guest OS" option in the settings of the nested ESXi, as we can see in the image:


This parameter must be changed with the VM powered off; then we restart the nested ESXi and relaunch the App installation. We may run into an installation error message and have to "uninstall" the App to clean up the remains of the failed installation, and then launch the "install" again.
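For reference, a hedged sketch of the equivalent .vmx entry on the nested ESXi VM (valid for vSphere 5.1 / hardware version 9; on 5.0 the mechanism is different):

# Expose hardware-assisted virtualization (VT-x / AMD-V) to the guest
vhv.enable = "TRUE"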

In another post we will see how to uninstall vCloud agents from ESXi hosts manually.

Once the install has been relaunched, we can see from the console that the installation finished correctly and it now prompts us directly for the login:


We now have the vShield App and Endpoint Apps deployed; the vShield Data Security App must be deployed separately. ...voilà!



Documentation, more than 180 pages in this PDF ...a lot to read:

vShield Administration Guide 5.1 http://www.vmware.com/pdf/vshield_51_admin.pdf:
Contents:
     -vShield Manager 5.1
     -vShield App 5.1
     -vShield Edge 5.1
     -vShield Endpoint 5.1



25 September 2013

Understanding networks in vCloud Director - Part 1/2

To use the product effectively, you must be clear about the network types that exist in vCloud Director.


We have basically three types:

External Network: a port group on a switch (distributed, standard or Nexus). It is the VLAN and IP segment (public or private) allocated physically.

Organization Network: created automatically when the Provider VDC is created; it is only for organization use and can be one of three types:
1-direct: connected directly to an external network
2-routed: connected through a vShield Edge which has two IPs, one on the External Network side and one on the Organization Network side, to route the traffic between the vApps and the External Network
3-not connected to any external network

vApp network: this network is created automatically when the vApp is created. The two options are:
-Not connected to the Organization Network
-Routed: connected to the Organization Network
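As a purely illustrative sketch (names and VLAN invented), the three layers chain together like this:

External Network (port group, VLAN 100) <-> vShield Edge <-> Organization Network <-> vApp router <-> vApp network <-> VMs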


....The next post, Part 2/2, will review more concepts like fenced, isolated or routed networks.

17 September 2013

HP Smart Update Manager 6.0 (HP SUM 6)


HP Smart Update Manager is a product which updates firmware and software on HP ProLiant servers, and firmware on HP Integrity servers. HP SUM has a browser-based GUI, as well as a scriptable interface using legacy command line interface, input file, and console (technology preview) modes.

User Guide:
http://www.hp.com/support/HP_SUM_UG_en

Technical White Paper:
http://h20195.www2.hp.com/V2/GetDocument.aspx?docname=4AA4-6947ENW&cc=us&lc=en

Download "HP Smart Update Manager version 6.00 - ISO":
http://h17007.www1.hp.com/us/en/enterprise/servers/products/service_pack/hpsum/index.aspx




11 September 2013

Do you know what NUMA is? really? - Part 1

This post is about the NUMA concept, because many people talk about NUMA and don't "really" know what it is or how to use it.

To begin with, NUMA refers to server platforms that have more than one system bus and dedicate different memory banks to different processors.

See an example with 2 buses (2 sockets), each accessing its own memory banks internally and accessing the remaining memory banks when interleaving is disabled in the BIOS:


Each CPU (socket) + local memory = NUMA node

In the past there were servers with only one bus; CPUs kept growing in GHz and memory consumption grew with them. That is why nowadays servers have more than one bus, and you must install the memory in banks paired with each bus.

If the memory is not populated correctly and distributed equally across the nodes, the OS stops responding and displays a purple screen (PSOD) with the following NUMA node error message:



In a NUMA architecture, processors may access local memory quickly and remote memory more slowly. This can dramatically improve memory throughput as long as the data are localized to specific processes (and thus processors). On the downside, NUMA makes the cost of moving data from one processor to another, as in workload balancing, more expensive.  The high latency of remote memory accesses can leave the processors under-utilized, constantly waiting for data to be transferred to the local node, and the NUMA connection can become a bottleneck for applications with high-memory bandwidth demands.

However, an advanced memory controller allows a node to use memory on all other nodes, creating a single system image. When a processor accesses memory that does not lie within its own node (remote memory), the data must be transferred over the NUMA connection, which is slower than accessing local memory. Memory access times are not uniform and depend on the location of the memory and the node from which it is accessed, as the technology’s name implies.

Memory interleaving refers to the way the system maps its memory addresses to the physical memory locations in the memory channels and DIMMs. Typically, consecutive system memory addresses are staggered across the DIMM ranks and across memory channels in the following manner:

>Rank Interleaving. Every consecutive memory cache line (64 bytes) is mapped to a different DIMM rank.
>Channel Interleaving. Every consecutive memory cache line is mapped to a different memory channel.

Disabling Node Interleaving = NUMA enabled
Lastly, note that NUMA is only available on Intel Nehalem (and later) and AMD Opteron platforms.
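As a quick hedged check on ESXi 5.x, you can see how many NUMA nodes a host exposes from the ESXi shell:

# Reports the physical memory and the NUMA node count of the host
esxcli hardware memory get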


The next post, "Part 2", will review NUMA usage on VMware.

10 September 2013

Sizing Storage

Once you get a clear idea of who is who in the complicated I/O latency world (review the other two posts here and here)... you need to think about the general rules and best practices for sizing storage for your IOPS needs.


GAVG (Guest Average Latency): total latency as seen from vSphere.
KAVG (Kernel Average Latency): time an I/O request spent waiting inside the vSphere storage stack.
QAVG (Queue Average Latency): time spent waiting in a queue inside the vSphere storage stack.

DAVG (Device Average Latency): latency coming from the physical hardware, HBA and storage device.
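These counters are related, which helps when reading esxtop: GAVG = DAVG + KAVG, and QAVG is contained within KAVG, so time spent queueing shows up as kernel latency.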

Bad performance if:

High Device Latency: Device Average Latency (DAVG) consistently greater than 20 to 30 ms may cause a performance problem for your typical application.
High Kernel Latency: Kernel Average Latency (KAVG) should usually be 0 in an ideal environment, but anything greater than 2 ms may be a performance problem.

And now, what about the storage?

The typical workload shown in this picture applies to most virtual machines; but other types, like databases, are quite different and will need special configurations such as dedicated RAID groups, SSD cache pools, etc.

When planning and sizing an infrastructure, it is very important to bear in mind not only the storage: it is a mix of the RAID configuration, disk types (SSD, SAS, SATA), the storage protocol (FC, FCoE, iSCSI or NFS), the networking configuration (cabling, switches, VLANs), the VMware configuration (storage adapters, datastore configuration: VMFS, RDM, advanced parameters) and so on.


09 September 2013

The HP Power Advisor utility (free)

HP has created the HP Power Advisor utility, which provides accurate and meaningful estimates of the power needs of HP ProLiant and Integrity servers.

     The HP Power Advisor utility is a tool for calculating power use of the major components within a rack to determine power distribution, power redundancy, and battery backup requirements for computer facilities. Power Advisor allows you to configure each individual server or node. You can then duplicate the server configuration as often as necessary to populate an enclosure, and then duplicate it to populate a rack. The result is that you can build a complete data center quickly by duplicating a rack.

Easy to install, easy to use (drag & drop), friendly interface... a great free tool!!!







05 September 2013

Visual ESXTOP GUI for VMware

VMware visualEsxtop is a tool from VMware Labs that shows the results of the esxtop CLI command in a windowed GUI (hehehe), with a friendly option to generate some graphics.

1- Download it from http://labs.vmware.com/flings/visualesxtop and unzip the files:



2- Change the PATH in System Properties - Advanced System Settings - Advanced - Environment Variables - System Variables: "PATH=C:\Program Files (x86)\Java\jre6\bin;"
(Be careful with these changes, and back up all the info you modify in case you need to restore it.)
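Alternatively, as a sketch that only affects the current CMD session (so nothing permanent is changed):

set PATH=%PATH%;C:\Program Files (x86)\Java\jre6\bin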


3- Start it from CMD or click on "visualEsxtop.bat" and connect to an ESX host or a vCenter Server from the menu (File - Connect to Live Server), or start it from PowerCLI:




This will open a new window with the GUI:

Now you can review all the metrics from the different tabs:

...And generate some graphics, for example the %CPU Ready:




